{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "opened-gasoline",
   "metadata": {},
   "source": [
    "# `clean_ml()`: Clean dataset for downstreaming machine learning tasks."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "institutional-humidity",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "\n",
    "The function `clean_ml()` cleans a dataset for downstreaming machine learning tasks with commonly used operators. It deals with categrical columns and numerical columns sperately. We set the default cleaning pipeline according to existing tools. \n",
    "\n",
    "Currently, the supported components and operators are listed below:\n",
    "\n",
    "* `cat_encoding`: encoding categrical columns\n",
    "    * no_encoding\n",
    "    * one_hot\n",
    "* `cat_imputation`: imputing missing values in categorical columns\n",
    "    * constant\n",
    "    * most_frequent\n",
    "    * drop\n",
    "* `num_imputataion`\t: imputing missing values in numerical columns\n",
    "    * mean\n",
    "    * median\n",
    "    * most_frequent\n",
    "    * drop\n",
    "* `num_scaling`: scaling numerical columns\n",
    "    * standarize\n",
    "    * minmax\n",
    "    * maxabs\n",
    "* `variance_threshold`: dropping numerical columns with low variance\n",
    "\n",
    "Users can also specify `include_operators` and `exclude_operators` to include or exclude specified operators listed above. User can also customize the pipeline with user-defined operators. "
   ]
  },
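  {
   "cell_type": "markdown",
   "id": "sketch-numeric-operators",
   "metadata": {},
   "source": [
    "As a rough illustration of what the numerical operators listed above compute, here is a minimal pure-Python sketch. It is not the `clean_ml()` implementation: the helper names (`impute_mean`, `standardize`, `minmax`, `maxabs`) are hypothetical, the formulas are the textbook ones, and the use of the population standard deviation is an assumption.\n",
    "\n",
    "```python\n",
    "import statistics\n",
    "\n",
    "def impute_mean(values):\n",
    "    # Replace None entries with the mean of the observed entries.\n",
    "    observed = [v for v in values if v is not None]\n",
    "    mean = statistics.fmean(observed)\n",
    "    return [mean if v is None else v for v in values]\n",
    "\n",
    "def standardize(values):\n",
    "    # Shift to zero mean and scale to unit variance (population std assumed).\n",
    "    mean = statistics.fmean(values)\n",
    "    std = statistics.pstdev(values)\n",
    "    return [(v - mean) / std for v in values]\n",
    "\n",
    "def minmax(values):\n",
    "    # Rescale linearly so the smallest value maps to 0 and the largest to 1.\n",
    "    lo, hi = min(values), max(values)\n",
    "    return [(v - lo) / (hi - lo) for v in values]\n",
    "\n",
    "def maxabs(values):\n",
    "    # Divide by the largest absolute value, so results lie in [-1, 1].\n",
    "    scale = max(abs(v) for v in values)\n",
    "    return [v / scale for v in values]\n",
    "\n",
    "column = [2.0, None, 4.0, 6.0]   # hypothetical numerical column with one gap\n",
    "filled = impute_mean(column)\n",
    "print(filled)          # [2.0, 4.0, 4.0, 6.0]\n",
    "print(minmax(filled))  # [0.0, 0.5, 0.5, 1.0]\n",
    "```\n",
    "\n",
    "All three scalers in this sketch divide by zero on a constant column, which is one reason a low-variance filter such as `variance_threshold` is useful before scaling."
   ]
  },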
  {
   "cell_type": "markdown",
   "id": "increasing-harmony",
   "metadata": {},
   "source": [
    "### An example dataset\n",
    "The example dataset is a very traditional dataset [adult](https://archive.ics.uci.edu/ml/datasets/adult). It has 48842 rows and 15 columns. In this dataset, '?' means the missing values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "present-remains",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>77516</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>3</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>83311</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>215646</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>234721</td>\n",
       "      <td>11th</td>\n",
       "      <td>7</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>338409</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Wife</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>Cuba</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>284582</td>\n",
       "      <td>Masters</td>\n",
       "      <td>14</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>160187</td>\n",
       "      <td>9th</td>\n",
       "      <td>5</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>Jamaica</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>3</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>209642</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>45781</td>\n",
       "      <td>Masters</td>\n",
       "      <td>14</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>159449</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>280464</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>4</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>1</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>141297</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>India</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>0</td>\n",
       "      <td>Private</td>\n",
       "      <td>122272</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>205019</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>12</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>121772</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>11</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>?</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>224655</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Priv-house-serv</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>247547</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>11</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>4</td>\n",
       "      <td>Private</td>\n",
       "      <td>292710</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>12</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>173449</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>285570</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>4</td>\n",
       "      <td>Private</td>\n",
       "      <td>89686</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>440129</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>0</td>\n",
       "      <td>Private</td>\n",
       "      <td>350977</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>3</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>349230</td>\n",
       "      <td>Masters</td>\n",
       "      <td>14</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>245211</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>215419</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>4</td>\n",
       "      <td>?</td>\n",
       "      <td>321403</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Widowed</td>\n",
       "      <td>?</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>374983</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>83891</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>1</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>182148</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>48842 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       age         workclass  fnlwgt     education  education-num  \\\n",
       "0        2         State-gov   77516     Bachelors             13   \n",
       "1        3  Self-emp-not-inc   83311     Bachelors             13   \n",
       "2        2           Private  215646       HS-grad              9   \n",
       "3        3           Private  234721          11th              7   \n",
       "4        1           Private  338409     Bachelors             13   \n",
       "5        2           Private  284582       Masters             14   \n",
       "6        3           Private  160187           9th              5   \n",
       "7        3  Self-emp-not-inc  209642       HS-grad              9   \n",
       "8        1           Private   45781       Masters             14   \n",
       "9        2           Private  159449     Bachelors             13   \n",
       "10       2           Private  280464  Some-college             10   \n",
       "11       1         State-gov  141297     Bachelors             13   \n",
       "12       0           Private  122272     Bachelors             13   \n",
       "13       1           Private  205019    Assoc-acdm             12   \n",
       "14       2           Private  121772     Assoc-voc             11   \n",
       "...    ...               ...     ...           ...            ...   \n",
       "48827    3           Private  224655       HS-grad              9   \n",
       "48828    2           Private  247547     Assoc-voc             11   \n",
       "48829    4           Private  292710    Assoc-acdm             12   \n",
       "48830    1           Private  173449       HS-grad              9   \n",
       "48831    3           Private  285570       HS-grad              9   \n",
       "48832    4           Private   89686       HS-grad              9   \n",
       "48833    1           Private  440129       HS-grad              9   \n",
       "48834    0           Private  350977       HS-grad              9   \n",
       "48835    3         Local-gov  349230       Masters             14   \n",
       "48836    1           Private  245211     Bachelors             13   \n",
       "48837    2           Private  215419     Bachelors             13   \n",
       "48838    4                 ?  321403       HS-grad              9   \n",
       "48839    2           Private  374983     Bachelors             13   \n",
       "48840    2           Private   83891     Bachelors             13   \n",
       "48841    1      Self-emp-inc  182148     Bachelors             13   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "0              Never-married       Adm-clerical   Not-in-family   \n",
       "1         Married-civ-spouse    Exec-managerial         Husband   \n",
       "2                   Divorced  Handlers-cleaners   Not-in-family   \n",
       "3         Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "4         Married-civ-spouse     Prof-specialty            Wife   \n",
       "5         Married-civ-spouse    Exec-managerial            Wife   \n",
       "6      Married-spouse-absent      Other-service   Not-in-family   \n",
       "7         Married-civ-spouse    Exec-managerial         Husband   \n",
       "8              Never-married     Prof-specialty   Not-in-family   \n",
       "9         Married-civ-spouse    Exec-managerial         Husband   \n",
       "10        Married-civ-spouse    Exec-managerial         Husband   \n",
       "11        Married-civ-spouse     Prof-specialty         Husband   \n",
       "12             Never-married       Adm-clerical       Own-child   \n",
       "13             Never-married              Sales   Not-in-family   \n",
       "14        Married-civ-spouse       Craft-repair         Husband   \n",
       "...                      ...                ...             ...   \n",
       "48827              Separated    Priv-house-serv   Not-in-family   \n",
       "48828          Never-married       Adm-clerical       Unmarried   \n",
       "48829               Divorced     Prof-specialty   Not-in-family   \n",
       "48830     Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "48831     Married-civ-spouse       Adm-clerical         Husband   \n",
       "48832     Married-civ-spouse              Sales         Husband   \n",
       "48833     Married-civ-spouse       Craft-repair         Husband   \n",
       "48834          Never-married      Other-service       Own-child   \n",
       "48835               Divorced      Other-service   Not-in-family   \n",
       "48836          Never-married     Prof-specialty       Own-child   \n",
       "48837               Divorced     Prof-specialty   Not-in-family   \n",
       "48838                Widowed                  ?  Other-relative   \n",
       "48839     Married-civ-spouse     Prof-specialty         Husband   \n",
       "48840               Divorced       Adm-clerical       Own-child   \n",
       "48841     Married-civ-spouse    Exec-managerial         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "0                   White    Male            1            0             2   \n",
       "1                   White    Male            0            0             0   \n",
       "2                   White    Male            0            0             2   \n",
       "3                   Black    Male            0            0             2   \n",
       "4                   Black  Female            0            0             2   \n",
       "5                   White  Female            0            0             2   \n",
       "6                   Black  Female            0            0             0   \n",
       "7                   White    Male            0            0             2   \n",
       "8                   White  Female            4            0             3   \n",
       "9                   White    Male            2            0             2   \n",
       "10                  Black    Male            0            0             4   \n",
       "11     Asian-Pac-Islander    Male            0            0             2   \n",
       "12                  White  Female            0            0             1   \n",
       "13                  Black    Male            0            0             3   \n",
       "14     Asian-Pac-Islander    Male            0            0             2   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "48827               White  Female            0            0             1   \n",
       "48828               Black  Female            0            0             2   \n",
       "48829               White    Male            0            0             2   \n",
       "48830               White    Male            0            0             2   \n",
       "48831               White    Male            0            0             2   \n",
       "48832               White    Male            0            0             3   \n",
       "48833               White    Male            0            0             2   \n",
       "48834               White  Female            0            0             2   \n",
       "48835               White    Male            0            0             2   \n",
       "48836               White    Male            0            0             2   \n",
       "48837               White  Female            0            0             2   \n",
       "48838               Black    Male            0            0             2   \n",
       "48839               White    Male            0            0             3   \n",
       "48840  Asian-Pac-Islander    Male            2            0             2   \n",
       "48841               White    Male            0            0             3   \n",
       "\n",
       "      native-country  class  \n",
       "0      United-States  <=50K  \n",
       "1      United-States  <=50K  \n",
       "2      United-States  <=50K  \n",
       "3      United-States  <=50K  \n",
       "4               Cuba  <=50K  \n",
       "5      United-States  <=50K  \n",
       "6            Jamaica  <=50K  \n",
       "7      United-States   >50K  \n",
       "8      United-States   >50K  \n",
       "9      United-States   >50K  \n",
       "10     United-States   >50K  \n",
       "11             India   >50K  \n",
       "12     United-States  <=50K  \n",
       "13     United-States  <=50K  \n",
       "14                 ?   >50K  \n",
       "...              ...    ...  \n",
       "48827  United-States  <=50K  \n",
       "48828  United-States  <=50K  \n",
       "48829  United-States  <=50K  \n",
       "48830  United-States  <=50K  \n",
       "48831  United-States  <=50K  \n",
       "48832  United-States  <=50K  \n",
       "48833  United-States  <=50K  \n",
       "48834  United-States  <=50K  \n",
       "48835  United-States  <=50K  \n",
       "48836  United-States  <=50K  \n",
       "48837  United-States  <=50K  \n",
       "48838  United-States  <=50K  \n",
       "48839  United-States  <=50K  \n",
       "48840  United-States  <=50K  \n",
       "48841  United-States   >50K  \n",
       "\n",
       "[48842 rows x 15 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "pd.set_option('display.min_rows', 30)\n",
    "df = pd.read_csv('adult.csv')\n",
    "df"
   ]
  },
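  {
   "cell_type": "markdown",
   "id": "sketch-missing-marker",
   "metadata": {},
   "source": [
    "The '?' placeholder is visible in rows 14 and 48838 of the table above. The following sketch, which uses only the standard library and is independent of how `clean_ml()` detects missing values, shows one way to map the placeholder to `None` and count the gaps per column. The `sample` text copies three columns of rows 13, 14 and 48838 from the output above; the variable names are illustrative.\n",
    "\n",
    "```python\n",
    "import csv\n",
    "import io\n",
    "\n",
    "# Three columns of rows 13, 14 and 48838 from the table above, as CSV text.\n",
    "sample = '''workclass,occupation,native-country\n",
    "Private,Sales,United-States\n",
    "Private,Craft-repair,?\n",
    "?,?,United-States\n",
    "'''\n",
    "\n",
    "rows = list(csv.DictReader(io.StringIO(sample)))\n",
    "\n",
    "# Map the '?' placeholder to None so it can be told apart from real values.\n",
    "cleaned = [{k: (None if v == '?' else v) for k, v in row.items()} for row in rows]\n",
    "\n",
    "# Count the missing entries in each column.\n",
    "missing = {k: sum(row[k] is None for row in cleaned) for k in cleaned[0]}\n",
    "print(missing)\n",
    "```\n",
    "\n",
    "Each of the three columns in this sample contains exactly one missing entry."
   ]
  },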
  {
   "cell_type": "markdown",
   "id": "boolean-patient",
   "metadata": {},
   "source": [
    "### Split the dataset as training dataframe and test dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "processed-inspiration",
   "metadata": {},
   "outputs": [],
   "source": [
    "training_rate = 0.7\n",
    "index = df.index\n",
    "number_of_rows = len(index)\n",
    "training_df = df.iloc[:int(training_rate * number_of_rows), :]\n",
    "test_df = df.iloc[int(training_rate * number_of_rows):, :]"
   ]
  },
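  {
   "cell_type": "markdown",
   "id": "sketch-split-sizes",
   "metadata": {},
   "source": [
    "To check the sizes this split produces, the sketch below reproduces the index arithmetic on plain row positions. The function `split_indices` and its optional `seed` argument are hypothetical helpers, not part of this notebook's pipeline; shuffling before the cut is shown only as a common precaution when the rows might be ordered.\n",
    "\n",
    "```python\n",
    "import random\n",
    "\n",
    "def split_indices(number_of_rows, training_rate, seed=None):\n",
    "    # Return (training, test) row positions; shuffle first only if a seed is given.\n",
    "    positions = list(range(number_of_rows))\n",
    "    if seed is not None:\n",
    "        random.Random(seed).shuffle(positions)\n",
    "    # int() truncates, so the training part gets the rounded-down share.\n",
    "    cut = int(training_rate * number_of_rows)\n",
    "    return positions[:cut], positions[cut:]\n",
    "\n",
    "train, test = split_indices(48842, 0.7)\n",
    "print(len(train), len(test))  # 34189 14653\n",
    "```\n",
    "\n",
    "Without a seed, the result matches the sequential split above: positions 0 to 34188 go to training and the remaining 14653 rows go to test."
   ]
  },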
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "certified-setup",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>77516</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>3</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>83311</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>215646</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>234721</td>\n",
       "      <td>11th</td>\n",
       "      <td>7</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>338409</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Wife</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>Cuba</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>284582</td>\n",
       "      <td>Masters</td>\n",
       "      <td>14</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>160187</td>\n",
       "      <td>9th</td>\n",
       "      <td>5</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>Jamaica</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>3</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>209642</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>45781</td>\n",
       "      <td>Masters</td>\n",
       "      <td>14</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>159449</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>280464</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>4</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>1</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>141297</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>India</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>0</td>\n",
       "      <td>Private</td>\n",
       "      <td>122272</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>205019</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>12</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>121772</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>11</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>?</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>173651</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>149337</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>4</td>\n",
       "      <td>Private</td>\n",
       "      <td>146674</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>?</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>4</td>\n",
       "      <td>Private</td>\n",
       "      <td>173483</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>0</td>\n",
       "      <td>Private</td>\n",
       "      <td>223669</td>\n",
       "      <td>11th</td>\n",
       "      <td>7</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>182177</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Protective-serv</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>0</td>\n",
       "      <td>Private</td>\n",
       "      <td>109414</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>India</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>3</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>150917</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>4</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>39128</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>3</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>103540</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>4</td>\n",
       "      <td>Private</td>\n",
       "      <td>110212</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>222450</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>El-Salvador</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>0</td>\n",
       "      <td>?</td>\n",
       "      <td>113760</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>?</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>2</td>\n",
       "      <td>?</td>\n",
       "      <td>253717</td>\n",
       "      <td>11th</td>\n",
       "      <td>7</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>?</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>0</td>\n",
       "      <td>Private</td>\n",
       "      <td>306908</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Machine-op-inspct</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       age         workclass  fnlwgt     education  education-num  \\\n",
       "0        2         State-gov   77516     Bachelors             13   \n",
       "1        3  Self-emp-not-inc   83311     Bachelors             13   \n",
       "2        2           Private  215646       HS-grad              9   \n",
       "3        3           Private  234721          11th              7   \n",
       "4        1           Private  338409     Bachelors             13   \n",
       "5        2           Private  284582       Masters             14   \n",
       "6        3           Private  160187           9th              5   \n",
       "7        3  Self-emp-not-inc  209642       HS-grad              9   \n",
       "8        1           Private   45781       Masters             14   \n",
       "9        2           Private  159449     Bachelors             13   \n",
       "10       2           Private  280464  Some-college             10   \n",
       "11       1         State-gov  141297     Bachelors             13   \n",
       "12       0           Private  122272     Bachelors             13   \n",
       "13       1           Private  205019    Assoc-acdm             12   \n",
       "14       2           Private  121772     Assoc-voc             11   \n",
       "...    ...               ...     ...           ...            ...   \n",
       "34174    2           Private  173651  Some-college             10   \n",
       "34175    3           Private  149337       HS-grad              9   \n",
       "34176    4           Private  146674       HS-grad              9   \n",
       "34177    4           Private  173483     Bachelors             13   \n",
       "34178    0           Private  223669          11th              7   \n",
       "34179    3           Private  182177  Some-college             10   \n",
       "34180    0           Private  109414  Some-college             10   \n",
       "34181    3      Self-emp-inc  150917  Some-college             10   \n",
       "34182    4  Self-emp-not-inc   39128       HS-grad              9   \n",
       "34183    3         Local-gov  103540     Bachelors             13   \n",
       "34184    4           Private  110212       HS-grad              9   \n",
       "34185    2           Private  222450       HS-grad              9   \n",
       "34186    0                 ?  113760       HS-grad              9   \n",
       "34187    2                 ?  253717          11th              7   \n",
       "34188    0           Private  306908       HS-grad              9   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "0              Never-married       Adm-clerical   Not-in-family   \n",
       "1         Married-civ-spouse    Exec-managerial         Husband   \n",
       "2                   Divorced  Handlers-cleaners   Not-in-family   \n",
       "3         Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "4         Married-civ-spouse     Prof-specialty            Wife   \n",
       "5         Married-civ-spouse    Exec-managerial            Wife   \n",
       "6      Married-spouse-absent      Other-service   Not-in-family   \n",
       "7         Married-civ-spouse    Exec-managerial         Husband   \n",
       "8              Never-married     Prof-specialty   Not-in-family   \n",
       "9         Married-civ-spouse    Exec-managerial         Husband   \n",
       "10        Married-civ-spouse    Exec-managerial         Husband   \n",
       "11        Married-civ-spouse     Prof-specialty         Husband   \n",
       "12             Never-married       Adm-clerical       Own-child   \n",
       "13             Never-married              Sales   Not-in-family   \n",
       "14        Married-civ-spouse       Craft-repair         Husband   \n",
       "...                      ...                ...             ...   \n",
       "34174     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34175              Separated  Handlers-cleaners   Not-in-family   \n",
       "34176     Married-civ-spouse      Other-service         Husband   \n",
       "34177               Divorced     Prof-specialty   Not-in-family   \n",
       "34178          Never-married      Other-service       Own-child   \n",
       "34179               Divorced    Protective-serv       Unmarried   \n",
       "34180          Never-married              Sales  Other-relative   \n",
       "34181     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34182     Married-civ-spouse       Craft-repair         Husband   \n",
       "34183     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34184     Married-civ-spouse      Other-service         Husband   \n",
       "34185          Never-married              Sales   Not-in-family   \n",
       "34186          Never-married                  ?       Own-child   \n",
       "34187     Married-civ-spouse                  ?            Wife   \n",
       "34188     Married-civ-spouse  Machine-op-inspct         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "0                   White    Male            1            0             2   \n",
       "1                   White    Male            0            0             0   \n",
       "2                   White    Male            0            0             2   \n",
       "3                   Black    Male            0            0             2   \n",
       "4                   Black  Female            0            0             2   \n",
       "5                   White  Female            0            0             2   \n",
       "6                   Black  Female            0            0             0   \n",
       "7                   White    Male            0            0             2   \n",
       "8                   White  Female            4            0             3   \n",
       "9                   White    Male            2            0             2   \n",
       "10                  Black    Male            0            0             4   \n",
       "11     Asian-Pac-Islander    Male            0            0             2   \n",
       "12                  White  Female            0            0             1   \n",
       "13                  Black    Male            0            0             3   \n",
       "14     Asian-Pac-Islander    Male            0            0             2   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "34174               White    Male            0            0             2   \n",
       "34175               White    Male            0            0             2   \n",
       "34176               Black    Male            0            0             2   \n",
       "34177               White  Female            0            0             1   \n",
       "34178               White    Male            0            0             0   \n",
       "34179               White  Female            0            0             1   \n",
       "34180  Asian-Pac-Islander    Male            0            0             0   \n",
       "34181               White    Male            0            3             2   \n",
       "34182               White    Male            0            0             1   \n",
       "34183               White    Male            0            0             3   \n",
       "34184               Black    Male            0            0             2   \n",
       "34185               White    Male            0            4             2   \n",
       "34186               White  Female            0            0             2   \n",
       "34187               White  Female            0            0             0   \n",
       "34188               White    Male            0            0             2   \n",
       "\n",
       "      native-country  class  \n",
       "0      United-States  <=50K  \n",
       "1      United-States  <=50K  \n",
       "2      United-States  <=50K  \n",
       "3      United-States  <=50K  \n",
       "4               Cuba  <=50K  \n",
       "5      United-States  <=50K  \n",
       "6            Jamaica  <=50K  \n",
       "7      United-States   >50K  \n",
       "8      United-States   >50K  \n",
       "9      United-States   >50K  \n",
       "10     United-States   >50K  \n",
       "11             India   >50K  \n",
       "12     United-States  <=50K  \n",
       "13     United-States  <=50K  \n",
       "14                 ?   >50K  \n",
       "...              ...    ...  \n",
       "34174  United-States   >50K  \n",
       "34175  United-States  <=50K  \n",
       "34176              ?   >50K  \n",
       "34177  United-States  <=50K  \n",
       "34178  United-States  <=50K  \n",
       "34179  United-States  <=50K  \n",
       "34180          India  <=50K  \n",
       "34181  United-States   >50K  \n",
       "34182  United-States  <=50K  \n",
       "34183  United-States   >50K  \n",
       "34184  United-States  <=50K  \n",
       "34185    El-Salvador  <=50K  \n",
       "34186  United-States  <=50K  \n",
       "34187  United-States  <=50K  \n",
       "34188  United-States  <=50K  \n",
       "\n",
       "[34189 rows x 15 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "sunrise-passing",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>2</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>263871</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>2</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>55294</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>0</td>\n",
       "      <td>Private</td>\n",
       "      <td>174063</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>11</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>3</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>258735</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>275867</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>0</td>\n",
       "      <td>Private</td>\n",
       "      <td>154235</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>1</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>210448</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>337908</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>1</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>205333</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>0</td>\n",
       "      <td>Private</td>\n",
       "      <td>187447</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>153589</td>\n",
       "      <td>9th</td>\n",
       "      <td>5</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>1</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>149988</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>398959</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>0</td>\n",
       "      <td>?</td>\n",
       "      <td>194096</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>10</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>?</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>44041</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>12</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>224655</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Priv-house-serv</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>247547</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>11</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>4</td>\n",
       "      <td>Private</td>\n",
       "      <td>292710</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>12</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>173449</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>3</td>\n",
       "      <td>Private</td>\n",
       "      <td>285570</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>4</td>\n",
       "      <td>Private</td>\n",
       "      <td>89686</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>440129</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>0</td>\n",
       "      <td>Private</td>\n",
       "      <td>350977</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>3</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>349230</td>\n",
       "      <td>Masters</td>\n",
       "      <td>14</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>1</td>\n",
       "      <td>Private</td>\n",
       "      <td>245211</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>215419</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>4</td>\n",
       "      <td>?</td>\n",
       "      <td>321403</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>9</td>\n",
       "      <td>Widowed</td>\n",
       "      <td>?</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>374983</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>2</td>\n",
       "      <td>Private</td>\n",
       "      <td>83891</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>1</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>182148</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>13</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       age         workclass  fnlwgt     education  education-num  \\\n",
       "34189    2  Self-emp-not-inc  263871  Some-college             10   \n",
       "34190    2         State-gov   55294     Bachelors             13   \n",
       "34191    0           Private  174063     Assoc-voc             11   \n",
       "34192    3         State-gov  258735  Some-college             10   \n",
       "34193    3           Private  275867       HS-grad              9   \n",
       "34194    0           Private  154235  Some-college             10   \n",
       "34195    1         Local-gov  210448  Some-college             10   \n",
       "34196    1           Private  337908  Some-college             10   \n",
       "34197    1         State-gov  205333     Bachelors             13   \n",
       "34198    0           Private  187447  Some-college             10   \n",
       "34199    1           Private  153589           9th              5   \n",
       "34200    1         Local-gov  149988  Some-college             10   \n",
       "34201    2           Private  398959  Some-college             10   \n",
       "34202    0                 ?  194096  Some-college             10   \n",
       "34203    2           Private   44041    Assoc-acdm             12   \n",
       "...    ...               ...     ...           ...            ...   \n",
       "48827    3           Private  224655       HS-grad              9   \n",
       "48828    2           Private  247547     Assoc-voc             11   \n",
       "48829    4           Private  292710    Assoc-acdm             12   \n",
       "48830    1           Private  173449       HS-grad              9   \n",
       "48831    3           Private  285570       HS-grad              9   \n",
       "48832    4           Private   89686       HS-grad              9   \n",
       "48833    1           Private  440129       HS-grad              9   \n",
       "48834    0           Private  350977       HS-grad              9   \n",
       "48835    3         Local-gov  349230       Masters             14   \n",
       "48836    1           Private  245211     Bachelors             13   \n",
       "48837    2           Private  215419     Bachelors             13   \n",
       "48838    4                 ?  321403       HS-grad              9   \n",
       "48839    2           Private  374983     Bachelors             13   \n",
       "48840    2           Private   83891     Bachelors             13   \n",
       "48841    1      Self-emp-inc  182148     Bachelors             13   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "34189     Married-civ-spouse       Craft-repair         Husband   \n",
       "34190     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34191          Never-married      Other-service       Own-child   \n",
       "34192     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34193     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34194          Never-married              Sales       Own-child   \n",
       "34195     Married-civ-spouse       Craft-repair  Other-relative   \n",
       "34196               Divorced       Adm-clerical       Unmarried   \n",
       "34197          Never-married     Prof-specialty   Not-in-family   \n",
       "34198              Separated      Other-service       Own-child   \n",
       "34199              Separated       Craft-repair   Not-in-family   \n",
       "34200               Divorced       Adm-clerical       Unmarried   \n",
       "34201     Married-civ-spouse       Craft-repair         Husband   \n",
       "34202          Never-married                  ?       Own-child   \n",
       "34203  Married-spouse-absent       Adm-clerical  Other-relative   \n",
       "...                      ...                ...             ...   \n",
       "48827              Separated    Priv-house-serv   Not-in-family   \n",
       "48828          Never-married       Adm-clerical       Unmarried   \n",
       "48829               Divorced     Prof-specialty   Not-in-family   \n",
       "48830     Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "48831     Married-civ-spouse       Adm-clerical         Husband   \n",
       "48832     Married-civ-spouse              Sales         Husband   \n",
       "48833     Married-civ-spouse       Craft-repair         Husband   \n",
       "48834          Never-married      Other-service       Own-child   \n",
       "48835               Divorced      Other-service   Not-in-family   \n",
       "48836          Never-married     Prof-specialty       Own-child   \n",
       "48837               Divorced     Prof-specialty   Not-in-family   \n",
       "48838                Widowed                  ?  Other-relative   \n",
       "48839     Married-civ-spouse     Prof-specialty         Husband   \n",
       "48840               Divorced       Adm-clerical       Own-child   \n",
       "48841     Married-civ-spouse    Exec-managerial         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "34189               White    Male            0            0             3   \n",
       "34190               White    Male            0            0             2   \n",
       "34191               White  Female            0            0             0   \n",
       "34192               White    Male            0            0             2   \n",
       "34193               White    Male            0            0             2   \n",
       "34194               White  Female            0            0             1   \n",
       "34195               White    Male            0            0             2   \n",
       "34196               Black  Female            0            0             1   \n",
       "34197               White  Female            0            0             0   \n",
       "34198               White    Male            0            0             2   \n",
       "34199               White    Male            0            0             2   \n",
       "34200               White  Female            0            0             2   \n",
       "34201               White    Male            0            0             2   \n",
       "34202               White  Female            0            0             1   \n",
       "34203               White    Male            0            0             3   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "48827               White  Female            0            0             1   \n",
       "48828               Black  Female            0            0             2   \n",
       "48829               White    Male            0            0             2   \n",
       "48830               White    Male            0            0             2   \n",
       "48831               White    Male            0            0             2   \n",
       "48832               White    Male            0            0             3   \n",
       "48833               White    Male            0            0             2   \n",
       "48834               White  Female            0            0             2   \n",
       "48835               White    Male            0            0             2   \n",
       "48836               White    Male            0            0             2   \n",
       "48837               White  Female            0            0             2   \n",
       "48838               Black    Male            0            0             2   \n",
       "48839               White    Male            0            0             3   \n",
       "48840  Asian-Pac-Islander    Male            2            0             2   \n",
       "48841               White    Male            0            0             3   \n",
       "\n",
       "      native-country  class  \n",
       "34189  United-States  <=50K  \n",
       "34190  United-States   >50K  \n",
       "34191  United-States  <=50K  \n",
       "34192  United-States   >50K  \n",
       "34193  United-States  <=50K  \n",
       "34194  United-States  <=50K  \n",
       "34195  United-States  <=50K  \n",
       "34196  United-States  <=50K  \n",
       "34197  United-States  <=50K  \n",
       "34198  United-States  <=50K  \n",
       "34199  United-States  <=50K  \n",
       "34200  United-States  <=50K  \n",
       "34201  United-States  <=50K  \n",
       "34202  United-States  <=50K  \n",
       "34203  United-States  <=50K  \n",
       "...              ...    ...  \n",
       "48827  United-States  <=50K  \n",
       "48828  United-States  <=50K  \n",
       "48829  United-States  <=50K  \n",
       "48830  United-States  <=50K  \n",
       "48831  United-States  <=50K  \n",
       "48832  United-States  <=50K  \n",
       "48833  United-States  <=50K  \n",
       "48834  United-States  <=50K  \n",
       "48835  United-States  <=50K  \n",
       "48836  United-States  <=50K  \n",
       "48837  United-States  <=50K  \n",
       "48838  United-States  <=50K  \n",
       "48839  United-States  <=50K  \n",
       "48840  United-States  <=50K  \n",
       "48841  United-States   >50K  \n",
       "\n",
       "[14653 rows x 15 columns]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "test_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "circular-blast",
   "metadata": {},
   "source": [
    "## 1. Default `clean_ml()`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "automotive-animation",
   "metadata": {},
   "source": [
    "By default, the cleaning pipeline of `clean_ml()` function:\n",
    "* For categorical columns: `constant imputation -> one-hot encoding`\n",
    "* For numerical columns: `mean imputation -> standardzation`\n",
    "\n",
    "The default NULL values are: `{np.nan, float(\"NaN\"), \"#N/A\", \"#N/A N/A\", \"#NA\", \"-1.#IND\", \"-1.#QNAN\", \"-NaN\", \"-nan\", \"1.#IND\", \"1.#QNAN\", \"<NA>\", \"N/A\", \"NA\", \"NULL\", \"NaN\", \"n/a\", \"nan\", \"null\", \"\", None}`\n",
    "\n",
    "The default filling value for categorical columns is 'missing_value'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "organized-wrestling",
   "metadata": {},
   "outputs": [],
   "source": [
    "from dataprep.clean import clean_ml\n",
    "cleaned_training_df, cleaned_test_df = clean_ml(training_df, test_df, target=\"class\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "filled-entertainment",
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.064247</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>1.054765</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.009237</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.246964</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.428035</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.412302</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.901345</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.279485</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.189970</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.365494</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>5.032415</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.286491</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.862254</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>2.292380</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.458800</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.639397</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.146086</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.644143</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.151677</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.382480</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.407759</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.153272</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.323123</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.070743</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.761452</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>-0.367482</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>5.199568</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.428648</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.817212</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.753877</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.311551</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>7.001429</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.720198</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.608356</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.113276</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age                                      workclass    fnlwgt  \\\n",
       "0      0.181564  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.064247   \n",
       "1      0.955953  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.009237   \n",
       "2      0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.246964   \n",
       "3      0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.428035   \n",
       "4     -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.412302   \n",
       "5      0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.901345   \n",
       "6      0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.279485   \n",
       "7      0.955953  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.189970   \n",
       "8     -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.365494   \n",
       "9      0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.286491   \n",
       "10     0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.862254   \n",
       "11    -0.592825  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.458800   \n",
       "12    -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.639397   \n",
       "13    -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.146086   \n",
       "14     0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.644143   \n",
       "...         ...                                            ...       ...   \n",
       "34174  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.151677   \n",
       "34175  0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.382480   \n",
       "34176  1.730342  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.407759   \n",
       "34177  1.730342  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.153272   \n",
       "34178 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.323123   \n",
       "34179  0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.070743   \n",
       "34180 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.761452   \n",
       "34181  0.955953  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0] -0.367482   \n",
       "34182  1.730342  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.428648   \n",
       "34183  0.955953  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0] -0.817212   \n",
       "34184  1.730342  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.753877   \n",
       "34185  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.311551   \n",
       "34186 -1.367214  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0] -0.720198   \n",
       "34187  0.181564  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  0.608356   \n",
       "34188 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.113276   \n",
       "\n",
       "                                               education  education-num  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "1      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "2      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "3      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -1.193092   \n",
       "4      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "5      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.520184   \n",
       "6      [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...      -1.968313   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "8      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.520184   \n",
       "9      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "10     [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "11     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "13     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...       0.744962   \n",
       "14     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...       0.357352   \n",
       "...                                                  ...            ...   \n",
       "34174  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34175  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34176  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34177  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34178  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -1.193092   \n",
       "34179  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34180  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34181  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34182  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34183  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34184  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34185  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34186  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34187  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -1.193092   \n",
       "34188  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "\n",
       "                            marital-status  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "1      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "2      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "3      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "4      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "5      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "6      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "8      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "9      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "10     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "11     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "13     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "14     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "...                                    ...   \n",
       "34174  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34175  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "34176  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34177  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34178  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34179  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34180  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34181  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34182  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34183  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34184  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34185  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34186  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34187  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34188  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "\n",
       "                                              occupation  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "1      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "2      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "3      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "4      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "5      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "6      [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "8      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "9      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "10     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "11     [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "13     [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "14     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "...                                                  ...   \n",
       "34174  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34175  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34176  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34177  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34178  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34179  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34180  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "34181  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34182  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34183  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34184  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34185  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "34186  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34187  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34188  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "\n",
       "                         relationship                       race         sex  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "1      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "2      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "3      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "4      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "5      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "6      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "8      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "9      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "10     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "11     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "12     [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "13     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "14     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "...                               ...                        ...         ...   \n",
       "34174  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34175  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34176  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34177  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34178  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34179  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34180  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34181  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34182  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34183  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34184  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34185  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34186  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34187  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34188  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "\n",
       "       capitalgain  capitalloss  hoursperweek  \\\n",
       "0         1.054765    -0.206016      0.053470   \n",
       "1        -0.271118    -0.206016     -2.185441   \n",
       "2        -0.271118    -0.206016      0.053470   \n",
       "3        -0.271118    -0.206016      0.053470   \n",
       "4        -0.271118    -0.206016      0.053470   \n",
       "5        -0.271118    -0.206016      0.053470   \n",
       "6        -0.271118    -0.206016     -2.185441   \n",
       "7        -0.271118    -0.206016      0.053470   \n",
       "8         5.032415    -0.206016      1.172925   \n",
       "9         2.380648    -0.206016      0.053470   \n",
       "10       -0.271118    -0.206016      2.292380   \n",
       "11       -0.271118    -0.206016      0.053470   \n",
       "12       -0.271118    -0.206016     -1.065986   \n",
       "13       -0.271118    -0.206016      1.172925   \n",
       "14       -0.271118    -0.206016      0.053470   \n",
       "...            ...          ...           ...   \n",
       "34174    -0.271118    -0.206016      0.053470   \n",
       "34175    -0.271118    -0.206016      0.053470   \n",
       "34176    -0.271118    -0.206016      0.053470   \n",
       "34177    -0.271118    -0.206016     -1.065986   \n",
       "34178    -0.271118    -0.206016     -2.185441   \n",
       "34179    -0.271118    -0.206016     -1.065986   \n",
       "34180    -0.271118    -0.206016     -2.185441   \n",
       "34181    -0.271118     5.199568      0.053470   \n",
       "34182    -0.271118    -0.206016     -1.065986   \n",
       "34183    -0.271118    -0.206016      1.172925   \n",
       "34184    -0.271118    -0.206016      0.053470   \n",
       "34185    -0.271118     7.001429      0.053470   \n",
       "34186    -0.271118    -0.206016      0.053470   \n",
       "34187    -0.271118    -0.206016     -2.185441   \n",
       "34188    -0.271118    -0.206016      0.053470   \n",
       "\n",
       "                                          native-country  class  \n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "1      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "2      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "3      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "4      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "5      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "6      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "7      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "8      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "9      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "10     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "11     [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "13     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "14     [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "...                                                  ...    ...  \n",
       "34174  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34175  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34176  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34177  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34178  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34179  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34180  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34181  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34182  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34183  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34184  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34185  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34186  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34187  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34188  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "\n",
       "[34189 rows x 15 columns]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "proud-surgery",
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.704744</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.275191</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.147766</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.655990</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.818617</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.335985</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.197621</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.407546</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.149067</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.020718</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.342117</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.376300</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.987078</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.042399</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.382011</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.332483</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.549787</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.978500</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.153595</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.910723</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.948722</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>2.377888</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.531605</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.515021</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.527612</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.244809</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.250871</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.759484</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.003732</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>-0.071019</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age                                      workclass    fnlwgt  \\\n",
       "34189  0.181564  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.704744   \n",
       "34190  0.181564  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.275191   \n",
       "34191 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.147766   \n",
       "34192  0.955953  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.655990   \n",
       "34193  0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.818617   \n",
       "34194 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.335985   \n",
       "34195 -0.592825  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  0.197621   \n",
       "34196 -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.407546   \n",
       "34197 -0.592825  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.149067   \n",
       "34198 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.020718   \n",
       "34199 -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.342117   \n",
       "34200 -0.592825  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0] -0.376300   \n",
       "34201  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.987078   \n",
       "34202 -1.367214  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  0.042399   \n",
       "34203  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.382011   \n",
       "...         ...                                            ...       ...   \n",
       "48827  0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.332483   \n",
       "48828  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.549787   \n",
       "48829  1.730342  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.978500   \n",
       "48830 -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.153595   \n",
       "48831  0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.910723   \n",
       "48832  1.730342  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.948722   \n",
       "48833 -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  2.377888   \n",
       "48834 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.531605   \n",
       "48835  0.955953  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  1.515021   \n",
       "48836 -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.527612   \n",
       "48837  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.244809   \n",
       "48838  1.730342  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  1.250871   \n",
       "48839  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.759484   \n",
       "48840  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.003732   \n",
       "48841 -0.592825  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0] -0.071019   \n",
       "\n",
       "                                               education  education-num  \\\n",
       "34189  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34190  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34191  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...       0.357352   \n",
       "34192  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34193  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34194  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34195  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34196  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34198  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34199  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...      -1.968313   \n",
       "34200  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34201  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34202  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34203  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...       0.744962   \n",
       "...                                                  ...            ...   \n",
       "48827  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48828  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...       0.357352   \n",
       "48829  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...       0.744962   \n",
       "48830  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48831  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48832  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48833  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48834  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48835  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.520184   \n",
       "48836  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48837  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48838  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48839  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48840  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48841  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "\n",
       "                            marital-status  \\\n",
       "34189  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34190  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34191  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34192  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34193  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34194  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34195  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34196  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34198  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "34199  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "34200  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34201  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34202  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34203  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   \n",
       "...                                    ...   \n",
       "48827  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "48828  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48829  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48830  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48831  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48832  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48833  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48834  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48835  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48836  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48837  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48838  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]   \n",
       "48839  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48840  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48841  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "\n",
       "                                              occupation  \\\n",
       "34189  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34190  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34191  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34192  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34193  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34194  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "34195  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34196  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34197  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34198  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34199  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34200  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34201  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34202  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34203  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "...                                                  ...   \n",
       "48827  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48828  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48829  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48830  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48831  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48832  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "48833  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "48834  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48835  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48836  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48837  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48838  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48839  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48840  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48841  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "\n",
       "                         relationship                       race         sex  \\\n",
       "34189  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34190  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34191  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34192  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34193  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34194  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34195  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34196  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34198  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34199  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34200  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34201  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34202  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34203  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "...                               ...                        ...         ...   \n",
       "48827  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48828  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48829  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48830  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48831  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48832  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48833  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48834  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48835  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48836  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48837  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48838  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48839  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48840  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48841  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "\n",
       "       capitalgain  capitalloss  hoursperweek  \\\n",
       "34189    -0.271118    -0.206016      1.172925   \n",
       "34190    -0.271118    -0.206016      0.053470   \n",
       "34191    -0.271118    -0.206016     -2.185441   \n",
       "34192    -0.271118    -0.206016      0.053470   \n",
       "34193    -0.271118    -0.206016      0.053470   \n",
       "34194    -0.271118    -0.206016     -1.065986   \n",
       "34195    -0.271118    -0.206016      0.053470   \n",
       "34196    -0.271118    -0.206016     -1.065986   \n",
       "34197    -0.271118    -0.206016     -2.185441   \n",
       "34198    -0.271118    -0.206016      0.053470   \n",
       "34199    -0.271118    -0.206016      0.053470   \n",
       "34200    -0.271118    -0.206016      0.053470   \n",
       "34201    -0.271118    -0.206016      0.053470   \n",
       "34202    -0.271118    -0.206016     -1.065986   \n",
       "34203    -0.271118    -0.206016      1.172925   \n",
       "...            ...          ...           ...   \n",
       "48827    -0.271118    -0.206016     -1.065986   \n",
       "48828    -0.271118    -0.206016      0.053470   \n",
       "48829    -0.271118    -0.206016      0.053470   \n",
       "48830    -0.271118    -0.206016      0.053470   \n",
       "48831    -0.271118    -0.206016      0.053470   \n",
       "48832    -0.271118    -0.206016      1.172925   \n",
       "48833    -0.271118    -0.206016      0.053470   \n",
       "48834    -0.271118    -0.206016      0.053470   \n",
       "48835    -0.271118    -0.206016      0.053470   \n",
       "48836    -0.271118    -0.206016      0.053470   \n",
       "48837    -0.271118    -0.206016      0.053470   \n",
       "48838    -0.271118    -0.206016      0.053470   \n",
       "48839    -0.271118    -0.206016      1.172925   \n",
       "48840     2.380648    -0.206016      0.053470   \n",
       "48841    -0.271118    -0.206016      1.172925   \n",
       "\n",
       "                                          native-country  class  \n",
       "34189  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34190  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34191  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34192  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34193  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34194  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34195  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34196  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34198  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34199  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34200  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34201  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34202  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34203  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "...                                                  ...    ...  \n",
       "48827  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48828  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48829  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48830  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48831  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48832  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48833  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48834  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48835  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48836  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48837  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48838  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48839  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48840  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48841  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "\n",
       "[14653 rows x 15 columns]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_test_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "detailed-marijuana",
   "metadata": {},
   "source": [
    "## 2. `cat_imputation` and `cat_null_value`  parameter\n",
    "There are three choices for `cat_imputation` parameter:\n",
    "* `constant`: filling the missing value with constant values. The default is 'missing_value'.\n",
    "* `most_frequent`:  filling the missing value with most frequent value of this column.\n",
    "* `drop`: drop this column if there are missing values.\n",
    "\n",
    "`cat_null_value` parameter is a list including user-specified null values. The element in this list can be any type. For example:\n",
    "* ['?']\n",
    "* ['abc', np.nan, '?', 1265]\n",
    "\n",
    "By default, the specified missing values are replaced by \"missing_value\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "invalid-insured",
   "metadata": {},
   "source": [
    "#### cat_imputation = \"constant\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "wooden-pavilion",
   "metadata": {},
   "outputs": [],
   "source": [
    "cleaned_training_df, cleaned_test_df = clean_ml(training_df, test_df, target=\"class\", \n",
    "                                                cat_imputation=\"constant\", \n",
    "                                                cat_encoding=\"no_encoding\", cat_null_value=['?'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "limiting-desktop",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-1.064247</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>1.054765</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>-1.009237</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.246964</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.428035</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.412302</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Wife</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>Cuba</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.901345</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.279485</td>\n",
       "      <td>9th</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>Jamaica</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.189970</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.365494</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>5.032415</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.286491</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.862254</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>2.292380</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-0.458800</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>India</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.639397</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.146086</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.644143</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>missing_value</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.151677</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.382480</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.407759</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>missing_value</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.153272</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.323123</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.070743</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Protective-serv</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.761452</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>India</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>-0.367482</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>5.199568</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>-1.428648</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>-0.817212</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.753877</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.311551</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>7.001429</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>El-Salvador</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>missing_value</td>\n",
       "      <td>-0.720198</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>missing_value</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>missing_value</td>\n",
       "      <td>0.608356</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>missing_value</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.113276</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Machine-op-inspct</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age         workclass    fnlwgt     education  education-num  \\\n",
       "0      0.181564         State-gov -1.064247     Bachelors       1.132573   \n",
       "1      0.955953  Self-emp-not-inc -1.009237     Bachelors       1.132573   \n",
       "2      0.181564           Private  0.246964       HS-grad      -0.417870   \n",
       "3      0.955953           Private  0.428035          11th      -1.193092   \n",
       "4     -0.592825           Private  1.412302     Bachelors       1.132573   \n",
       "5      0.181564           Private  0.901345       Masters       1.520184   \n",
       "6      0.955953           Private -0.279485           9th      -1.968313   \n",
       "7      0.955953  Self-emp-not-inc  0.189970       HS-grad      -0.417870   \n",
       "8     -0.592825           Private -1.365494       Masters       1.520184   \n",
       "9      0.181564           Private -0.286491     Bachelors       1.132573   \n",
       "10     0.181564           Private  0.862254  Some-college      -0.030259   \n",
       "11    -0.592825         State-gov -0.458800     Bachelors       1.132573   \n",
       "12    -1.367214           Private -0.639397     Bachelors       1.132573   \n",
       "13    -0.592825           Private  0.146086    Assoc-acdm       0.744962   \n",
       "14     0.181564           Private -0.644143     Assoc-voc       0.357352   \n",
       "...         ...               ...       ...           ...            ...   \n",
       "34174  0.181564           Private -0.151677  Some-college      -0.030259   \n",
       "34175  0.955953           Private -0.382480       HS-grad      -0.417870   \n",
       "34176  1.730342           Private -0.407759       HS-grad      -0.417870   \n",
       "34177  1.730342           Private -0.153272     Bachelors       1.132573   \n",
       "34178 -1.367214           Private  0.323123          11th      -1.193092   \n",
       "34179  0.955953           Private -0.070743  Some-college      -0.030259   \n",
       "34180 -1.367214           Private -0.761452  Some-college      -0.030259   \n",
       "34181  0.955953      Self-emp-inc -0.367482  Some-college      -0.030259   \n",
       "34182  1.730342  Self-emp-not-inc -1.428648       HS-grad      -0.417870   \n",
       "34183  0.955953         Local-gov -0.817212     Bachelors       1.132573   \n",
       "34184  1.730342           Private -0.753877       HS-grad      -0.417870   \n",
       "34185  0.181564           Private  0.311551       HS-grad      -0.417870   \n",
       "34186 -1.367214     missing_value -0.720198       HS-grad      -0.417870   \n",
       "34187  0.181564     missing_value  0.608356          11th      -1.193092   \n",
       "34188 -1.367214           Private  1.113276       HS-grad      -0.417870   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "0              Never-married       Adm-clerical   Not-in-family   \n",
       "1         Married-civ-spouse    Exec-managerial         Husband   \n",
       "2                   Divorced  Handlers-cleaners   Not-in-family   \n",
       "3         Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "4         Married-civ-spouse     Prof-specialty            Wife   \n",
       "5         Married-civ-spouse    Exec-managerial            Wife   \n",
       "6      Married-spouse-absent      Other-service   Not-in-family   \n",
       "7         Married-civ-spouse    Exec-managerial         Husband   \n",
       "8              Never-married     Prof-specialty   Not-in-family   \n",
       "9         Married-civ-spouse    Exec-managerial         Husband   \n",
       "10        Married-civ-spouse    Exec-managerial         Husband   \n",
       "11        Married-civ-spouse     Prof-specialty         Husband   \n",
       "12             Never-married       Adm-clerical       Own-child   \n",
       "13             Never-married              Sales   Not-in-family   \n",
       "14        Married-civ-spouse       Craft-repair         Husband   \n",
       "...                      ...                ...             ...   \n",
       "34174     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34175              Separated  Handlers-cleaners   Not-in-family   \n",
       "34176     Married-civ-spouse      Other-service         Husband   \n",
       "34177               Divorced     Prof-specialty   Not-in-family   \n",
       "34178          Never-married      Other-service       Own-child   \n",
       "34179               Divorced    Protective-serv       Unmarried   \n",
       "34180          Never-married              Sales  Other-relative   \n",
       "34181     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34182     Married-civ-spouse       Craft-repair         Husband   \n",
       "34183     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34184     Married-civ-spouse      Other-service         Husband   \n",
       "34185          Never-married              Sales   Not-in-family   \n",
       "34186          Never-married      missing_value       Own-child   \n",
       "34187     Married-civ-spouse      missing_value            Wife   \n",
       "34188     Married-civ-spouse  Machine-op-inspct         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "0                   White    Male     1.054765    -0.206016      0.053470   \n",
       "1                   White    Male    -0.271118    -0.206016     -2.185441   \n",
       "2                   White    Male    -0.271118    -0.206016      0.053470   \n",
       "3                   Black    Male    -0.271118    -0.206016      0.053470   \n",
       "4                   Black  Female    -0.271118    -0.206016      0.053470   \n",
       "5                   White  Female    -0.271118    -0.206016      0.053470   \n",
       "6                   Black  Female    -0.271118    -0.206016     -2.185441   \n",
       "7                   White    Male    -0.271118    -0.206016      0.053470   \n",
       "8                   White  Female     5.032415    -0.206016      1.172925   \n",
       "9                   White    Male     2.380648    -0.206016      0.053470   \n",
       "10                  Black    Male    -0.271118    -0.206016      2.292380   \n",
       "11     Asian-Pac-Islander    Male    -0.271118    -0.206016      0.053470   \n",
       "12                  White  Female    -0.271118    -0.206016     -1.065986   \n",
       "13                  Black    Male    -0.271118    -0.206016      1.172925   \n",
       "14     Asian-Pac-Islander    Male    -0.271118    -0.206016      0.053470   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "34174               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34175               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34176               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "34177               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34178               White    Male    -0.271118    -0.206016     -2.185441   \n",
       "34179               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34180  Asian-Pac-Islander    Male    -0.271118    -0.206016     -2.185441   \n",
       "34181               White    Male    -0.271118     5.199568      0.053470   \n",
       "34182               White    Male    -0.271118    -0.206016     -1.065986   \n",
       "34183               White    Male    -0.271118    -0.206016      1.172925   \n",
       "34184               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "34185               White    Male    -0.271118     7.001429      0.053470   \n",
       "34186               White  Female    -0.271118    -0.206016      0.053470   \n",
       "34187               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34188               White    Male    -0.271118    -0.206016      0.053470   \n",
       "\n",
       "      native-country  class  \n",
       "0      United-States  <=50K  \n",
       "1      United-States  <=50K  \n",
       "2      United-States  <=50K  \n",
       "3      United-States  <=50K  \n",
       "4               Cuba  <=50K  \n",
       "5      United-States  <=50K  \n",
       "6            Jamaica  <=50K  \n",
       "7      United-States   >50K  \n",
       "8      United-States   >50K  \n",
       "9      United-States   >50K  \n",
       "10     United-States   >50K  \n",
       "11             India   >50K  \n",
       "12     United-States  <=50K  \n",
       "13     United-States  <=50K  \n",
       "14     missing_value   >50K  \n",
       "...              ...    ...  \n",
       "34174  United-States   >50K  \n",
       "34175  United-States  <=50K  \n",
       "34176  missing_value   >50K  \n",
       "34177  United-States  <=50K  \n",
       "34178  United-States  <=50K  \n",
       "34179  United-States  <=50K  \n",
       "34180          India  <=50K  \n",
       "34181  United-States   >50K  \n",
       "34182  United-States  <=50K  \n",
       "34183  United-States   >50K  \n",
       "34184  United-States  <=50K  \n",
       "34185    El-Salvador  <=50K  \n",
       "34186  United-States  <=50K  \n",
       "34187  United-States  <=50K  \n",
       "34188  United-States  <=50K  \n",
       "\n",
       "[34189 rows x 15 columns]"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "contained-chester",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.704744</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-1.275191</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.147766</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.655990</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.818617</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.335985</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.197621</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.407546</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.149067</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.020718</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.342117</td>\n",
       "      <td>9th</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>-0.376300</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.987078</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>missing_value</td>\n",
       "      <td>0.042399</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>missing_value</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.382011</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.332483</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Priv-house-serv</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.549787</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.978500</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.153595</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.910723</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.948722</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>2.377888</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.531605</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>1.515021</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.527612</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.244809</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>missing_value</td>\n",
       "      <td>1.250871</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Widowed</td>\n",
       "      <td>missing_value</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.759484</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.003732</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>-0.071019</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age         workclass    fnlwgt     education  education-num  \\\n",
       "34189  0.181564  Self-emp-not-inc  0.704744  Some-college      -0.030259   \n",
       "34190  0.181564         State-gov -1.275191     Bachelors       1.132573   \n",
       "34191 -1.367214           Private -0.147766     Assoc-voc       0.357352   \n",
       "34192  0.955953         State-gov  0.655990  Some-college      -0.030259   \n",
       "34193  0.955953           Private  0.818617       HS-grad      -0.417870   \n",
       "34194 -1.367214           Private -0.335985  Some-college      -0.030259   \n",
       "34195 -0.592825         Local-gov  0.197621  Some-college      -0.030259   \n",
       "34196 -0.592825           Private  1.407546  Some-college      -0.030259   \n",
       "34197 -0.592825         State-gov  0.149067     Bachelors       1.132573   \n",
       "34198 -1.367214           Private -0.020718  Some-college      -0.030259   \n",
       "34199 -0.592825           Private -0.342117           9th      -1.968313   \n",
       "34200 -0.592825         Local-gov -0.376300  Some-college      -0.030259   \n",
       "34201  0.181564           Private  1.987078  Some-college      -0.030259   \n",
       "34202 -1.367214     missing_value  0.042399  Some-college      -0.030259   \n",
       "34203  0.181564           Private -1.382011    Assoc-acdm       0.744962   \n",
       "...         ...               ...       ...           ...            ...   \n",
       "48827  0.955953           Private  0.332483       HS-grad      -0.417870   \n",
       "48828  0.181564           Private  0.549787     Assoc-voc       0.357352   \n",
       "48829  1.730342           Private  0.978500    Assoc-acdm       0.744962   \n",
       "48830 -0.592825           Private -0.153595       HS-grad      -0.417870   \n",
       "48831  0.955953           Private  0.910723       HS-grad      -0.417870   \n",
       "48832  1.730342           Private -0.948722       HS-grad      -0.417870   \n",
       "48833 -0.592825           Private  2.377888       HS-grad      -0.417870   \n",
       "48834 -1.367214           Private  1.531605       HS-grad      -0.417870   \n",
       "48835  0.955953         Local-gov  1.515021       Masters       1.520184   \n",
       "48836 -0.592825           Private  0.527612     Bachelors       1.132573   \n",
       "48837  0.181564           Private  0.244809     Bachelors       1.132573   \n",
       "48838  1.730342     missing_value  1.250871       HS-grad      -0.417870   \n",
       "48839  0.181564           Private  1.759484     Bachelors       1.132573   \n",
       "48840  0.181564           Private -1.003732     Bachelors       1.132573   \n",
       "48841 -0.592825      Self-emp-inc -0.071019     Bachelors       1.132573   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "34189     Married-civ-spouse       Craft-repair         Husband   \n",
       "34190     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34191          Never-married      Other-service       Own-child   \n",
       "34192     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34193     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34194          Never-married              Sales       Own-child   \n",
       "34195     Married-civ-spouse       Craft-repair  Other-relative   \n",
       "34196               Divorced       Adm-clerical       Unmarried   \n",
       "34197          Never-married     Prof-specialty   Not-in-family   \n",
       "34198              Separated      Other-service       Own-child   \n",
       "34199              Separated       Craft-repair   Not-in-family   \n",
       "34200               Divorced       Adm-clerical       Unmarried   \n",
       "34201     Married-civ-spouse       Craft-repair         Husband   \n",
       "34202          Never-married      missing_value       Own-child   \n",
       "34203  Married-spouse-absent       Adm-clerical  Other-relative   \n",
       "...                      ...                ...             ...   \n",
       "48827              Separated    Priv-house-serv   Not-in-family   \n",
       "48828          Never-married       Adm-clerical       Unmarried   \n",
       "48829               Divorced     Prof-specialty   Not-in-family   \n",
       "48830     Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "48831     Married-civ-spouse       Adm-clerical         Husband   \n",
       "48832     Married-civ-spouse              Sales         Husband   \n",
       "48833     Married-civ-spouse       Craft-repair         Husband   \n",
       "48834          Never-married      Other-service       Own-child   \n",
       "48835               Divorced      Other-service   Not-in-family   \n",
       "48836          Never-married     Prof-specialty       Own-child   \n",
       "48837               Divorced     Prof-specialty   Not-in-family   \n",
       "48838                Widowed      missing_value  Other-relative   \n",
       "48839     Married-civ-spouse     Prof-specialty         Husband   \n",
       "48840               Divorced       Adm-clerical       Own-child   \n",
       "48841     Married-civ-spouse    Exec-managerial         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "34189               White    Male    -0.271118    -0.206016      1.172925   \n",
       "34190               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34191               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34192               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34193               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34194               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34195               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34196               Black  Female    -0.271118    -0.206016     -1.065986   \n",
       "34197               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34198               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34199               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34200               White  Female    -0.271118    -0.206016      0.053470   \n",
       "34201               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34202               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34203               White    Male    -0.271118    -0.206016      1.172925   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "48827               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "48828               Black  Female    -0.271118    -0.206016      0.053470   \n",
       "48829               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48830               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48831               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48832               White    Male    -0.271118    -0.206016      1.172925   \n",
       "48833               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48834               White  Female    -0.271118    -0.206016      0.053470   \n",
       "48835               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48836               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48837               White  Female    -0.271118    -0.206016      0.053470   \n",
       "48838               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "48839               White    Male    -0.271118    -0.206016      1.172925   \n",
       "48840  Asian-Pac-Islander    Male     2.380648    -0.206016      0.053470   \n",
       "48841               White    Male    -0.271118    -0.206016      1.172925   \n",
       "\n",
       "      native-country  class  \n",
       "34189  United-States  <=50K  \n",
       "34190  United-States   >50K  \n",
       "34191  United-States  <=50K  \n",
       "34192  United-States   >50K  \n",
       "34193  United-States  <=50K  \n",
       "34194  United-States  <=50K  \n",
       "34195  United-States  <=50K  \n",
       "34196  United-States  <=50K  \n",
       "34197  United-States  <=50K  \n",
       "34198  United-States  <=50K  \n",
       "34199  United-States  <=50K  \n",
       "34200  United-States  <=50K  \n",
       "34201  United-States  <=50K  \n",
       "34202  United-States  <=50K  \n",
       "34203  United-States  <=50K  \n",
       "...              ...    ...  \n",
       "48827  United-States  <=50K  \n",
       "48828  United-States  <=50K  \n",
       "48829  United-States  <=50K  \n",
       "48830  United-States  <=50K  \n",
       "48831  United-States  <=50K  \n",
       "48832  United-States  <=50K  \n",
       "48833  United-States  <=50K  \n",
       "48834  United-States  <=50K  \n",
       "48835  United-States  <=50K  \n",
       "48836  United-States  <=50K  \n",
       "48837  United-States  <=50K  \n",
       "48838  United-States  <=50K  \n",
       "48839  United-States  <=50K  \n",
       "48840  United-States  <=50K  \n",
       "48841  United-States   >50K  \n",
       "\n",
       "[14653 rows x 15 columns]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_test_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "foster-census",
   "metadata": {},
   "source": [
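    "#### cat_imputation=\"most_frequent\"\n",
    "\n",
    "With this setting, missing values in categorical columns (here `?`, as declared by `cat_null_value`) are replaced with the most frequent value of the column."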
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "absent-laugh",
   "metadata": {},
   "outputs": [],
   "source": [
    "# '?' marks a missing value; categorical columns are imputed but left unencoded.\n",
    "cleaned_training_df, cleaned_test_df = clean_ml(training_df, test_df, target=\"class\",\n",
    "                                                cat_imputation=\"most_frequent\",\n",
    "                                                cat_encoding=\"no_encoding\", cat_null_value=['?'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "federal-middle",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-1.064247</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>1.054765</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>-1.009237</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.246964</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.428035</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.412302</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Wife</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>Cuba</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.901345</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.279485</td>\n",
       "      <td>9th</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>Jamaica</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.189970</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.365494</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>5.032415</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.286491</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.862254</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>2.292380</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-0.458800</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>India</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.639397</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.146086</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.644143</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.151677</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.382480</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.407759</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.153272</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.323123</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.070743</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Protective-serv</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.761452</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>India</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>-0.367482</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>5.199568</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>-1.428648</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>-0.817212</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.753877</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.311551</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>7.001429</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>El-Salvador</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.720198</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.608356</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.113276</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Machine-op-inspct</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age         workclass    fnlwgt     education  education-num  \\\n",
       "0      0.181564         State-gov -1.064247     Bachelors       1.132573   \n",
       "1      0.955953  Self-emp-not-inc -1.009237     Bachelors       1.132573   \n",
       "2      0.181564           Private  0.246964       HS-grad      -0.417870   \n",
       "3      0.955953           Private  0.428035          11th      -1.193092   \n",
       "4     -0.592825           Private  1.412302     Bachelors       1.132573   \n",
       "5      0.181564           Private  0.901345       Masters       1.520184   \n",
       "6      0.955953           Private -0.279485           9th      -1.968313   \n",
       "7      0.955953  Self-emp-not-inc  0.189970       HS-grad      -0.417870   \n",
       "8     -0.592825           Private -1.365494       Masters       1.520184   \n",
       "9      0.181564           Private -0.286491     Bachelors       1.132573   \n",
       "10     0.181564           Private  0.862254  Some-college      -0.030259   \n",
       "11    -0.592825         State-gov -0.458800     Bachelors       1.132573   \n",
       "12    -1.367214           Private -0.639397     Bachelors       1.132573   \n",
       "13    -0.592825           Private  0.146086    Assoc-acdm       0.744962   \n",
       "14     0.181564           Private -0.644143     Assoc-voc       0.357352   \n",
       "...         ...               ...       ...           ...            ...   \n",
       "34174  0.181564           Private -0.151677  Some-college      -0.030259   \n",
       "34175  0.955953           Private -0.382480       HS-grad      -0.417870   \n",
       "34176  1.730342           Private -0.407759       HS-grad      -0.417870   \n",
       "34177  1.730342           Private -0.153272     Bachelors       1.132573   \n",
       "34178 -1.367214           Private  0.323123          11th      -1.193092   \n",
       "34179  0.955953           Private -0.070743  Some-college      -0.030259   \n",
       "34180 -1.367214           Private -0.761452  Some-college      -0.030259   \n",
       "34181  0.955953      Self-emp-inc -0.367482  Some-college      -0.030259   \n",
       "34182  1.730342  Self-emp-not-inc -1.428648       HS-grad      -0.417870   \n",
       "34183  0.955953         Local-gov -0.817212     Bachelors       1.132573   \n",
       "34184  1.730342           Private -0.753877       HS-grad      -0.417870   \n",
       "34185  0.181564           Private  0.311551       HS-grad      -0.417870   \n",
       "34186 -1.367214           Private -0.720198       HS-grad      -0.417870   \n",
       "34187  0.181564           Private  0.608356          11th      -1.193092   \n",
       "34188 -1.367214           Private  1.113276       HS-grad      -0.417870   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "0              Never-married       Adm-clerical   Not-in-family   \n",
       "1         Married-civ-spouse    Exec-managerial         Husband   \n",
       "2                   Divorced  Handlers-cleaners   Not-in-family   \n",
       "3         Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "4         Married-civ-spouse     Prof-specialty            Wife   \n",
       "5         Married-civ-spouse    Exec-managerial            Wife   \n",
       "6      Married-spouse-absent      Other-service   Not-in-family   \n",
       "7         Married-civ-spouse    Exec-managerial         Husband   \n",
       "8              Never-married     Prof-specialty   Not-in-family   \n",
       "9         Married-civ-spouse    Exec-managerial         Husband   \n",
       "10        Married-civ-spouse    Exec-managerial         Husband   \n",
       "11        Married-civ-spouse     Prof-specialty         Husband   \n",
       "12             Never-married       Adm-clerical       Own-child   \n",
       "13             Never-married              Sales   Not-in-family   \n",
       "14        Married-civ-spouse       Craft-repair         Husband   \n",
       "...                      ...                ...             ...   \n",
       "34174     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34175              Separated  Handlers-cleaners   Not-in-family   \n",
       "34176     Married-civ-spouse      Other-service         Husband   \n",
       "34177               Divorced     Prof-specialty   Not-in-family   \n",
       "34178          Never-married      Other-service       Own-child   \n",
       "34179               Divorced    Protective-serv       Unmarried   \n",
       "34180          Never-married              Sales  Other-relative   \n",
       "34181     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34182     Married-civ-spouse       Craft-repair         Husband   \n",
       "34183     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34184     Married-civ-spouse      Other-service         Husband   \n",
       "34185          Never-married              Sales   Not-in-family   \n",
       "34186          Never-married     Prof-specialty       Own-child   \n",
       "34187     Married-civ-spouse     Prof-specialty            Wife   \n",
       "34188     Married-civ-spouse  Machine-op-inspct         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "0                   White    Male     1.054765    -0.206016      0.053470   \n",
       "1                   White    Male    -0.271118    -0.206016     -2.185441   \n",
       "2                   White    Male    -0.271118    -0.206016      0.053470   \n",
       "3                   Black    Male    -0.271118    -0.206016      0.053470   \n",
       "4                   Black  Female    -0.271118    -0.206016      0.053470   \n",
       "5                   White  Female    -0.271118    -0.206016      0.053470   \n",
       "6                   Black  Female    -0.271118    -0.206016     -2.185441   \n",
       "7                   White    Male    -0.271118    -0.206016      0.053470   \n",
       "8                   White  Female     5.032415    -0.206016      1.172925   \n",
       "9                   White    Male     2.380648    -0.206016      0.053470   \n",
       "10                  Black    Male    -0.271118    -0.206016      2.292380   \n",
       "11     Asian-Pac-Islander    Male    -0.271118    -0.206016      0.053470   \n",
       "12                  White  Female    -0.271118    -0.206016     -1.065986   \n",
       "13                  Black    Male    -0.271118    -0.206016      1.172925   \n",
       "14     Asian-Pac-Islander    Male    -0.271118    -0.206016      0.053470   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "34174               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34175               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34176               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "34177               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34178               White    Male    -0.271118    -0.206016     -2.185441   \n",
       "34179               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34180  Asian-Pac-Islander    Male    -0.271118    -0.206016     -2.185441   \n",
       "34181               White    Male    -0.271118     5.199568      0.053470   \n",
       "34182               White    Male    -0.271118    -0.206016     -1.065986   \n",
       "34183               White    Male    -0.271118    -0.206016      1.172925   \n",
       "34184               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "34185               White    Male    -0.271118     7.001429      0.053470   \n",
       "34186               White  Female    -0.271118    -0.206016      0.053470   \n",
       "34187               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34188               White    Male    -0.271118    -0.206016      0.053470   \n",
       "\n",
       "      native-country  class  \n",
       "0      United-States  <=50K  \n",
       "1      United-States  <=50K  \n",
       "2      United-States  <=50K  \n",
       "3      United-States  <=50K  \n",
       "4               Cuba  <=50K  \n",
       "5      United-States  <=50K  \n",
       "6            Jamaica  <=50K  \n",
       "7      United-States   >50K  \n",
       "8      United-States   >50K  \n",
       "9      United-States   >50K  \n",
       "10     United-States   >50K  \n",
       "11             India   >50K  \n",
       "12     United-States  <=50K  \n",
       "13     United-States  <=50K  \n",
       "14     United-States   >50K  \n",
       "...              ...    ...  \n",
       "34174  United-States   >50K  \n",
       "34175  United-States  <=50K  \n",
       "34176  United-States   >50K  \n",
       "34177  United-States  <=50K  \n",
       "34178  United-States  <=50K  \n",
       "34179  United-States  <=50K  \n",
       "34180          India  <=50K  \n",
       "34181  United-States   >50K  \n",
       "34182  United-States  <=50K  \n",
       "34183  United-States   >50K  \n",
       "34184  United-States  <=50K  \n",
       "34185    El-Salvador  <=50K  \n",
       "34186  United-States  <=50K  \n",
       "34187  United-States  <=50K  \n",
       "34188  United-States  <=50K  \n",
       "\n",
       "[34189 rows x 15 columns]"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "median-cisco",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.704744</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-1.275191</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.147766</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.655990</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.818617</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.335985</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.197621</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.407546</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.149067</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.020718</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.342117</td>\n",
       "      <td>9th</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>-0.376300</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.987078</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.042399</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.382011</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.332483</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Priv-house-serv</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.549787</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.978500</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.153595</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.910723</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.948722</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>2.377888</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.531605</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>1.515021</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.527612</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.244809</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.250871</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Widowed</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.759484</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.003732</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>-0.071019</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age         workclass    fnlwgt     education  education-num  \\\n",
       "34189  0.181564  Self-emp-not-inc  0.704744  Some-college      -0.030259   \n",
       "34190  0.181564         State-gov -1.275191     Bachelors       1.132573   \n",
       "34191 -1.367214           Private -0.147766     Assoc-voc       0.357352   \n",
       "34192  0.955953         State-gov  0.655990  Some-college      -0.030259   \n",
       "34193  0.955953           Private  0.818617       HS-grad      -0.417870   \n",
       "34194 -1.367214           Private -0.335985  Some-college      -0.030259   \n",
       "34195 -0.592825         Local-gov  0.197621  Some-college      -0.030259   \n",
       "34196 -0.592825           Private  1.407546  Some-college      -0.030259   \n",
       "34197 -0.592825         State-gov  0.149067     Bachelors       1.132573   \n",
       "34198 -1.367214           Private -0.020718  Some-college      -0.030259   \n",
       "34199 -0.592825           Private -0.342117           9th      -1.968313   \n",
       "34200 -0.592825         Local-gov -0.376300  Some-college      -0.030259   \n",
       "34201  0.181564           Private  1.987078  Some-college      -0.030259   \n",
       "34202 -1.367214           Private  0.042399  Some-college      -0.030259   \n",
       "34203  0.181564           Private -1.382011    Assoc-acdm       0.744962   \n",
       "...         ...               ...       ...           ...            ...   \n",
       "48827  0.955953           Private  0.332483       HS-grad      -0.417870   \n",
       "48828  0.181564           Private  0.549787     Assoc-voc       0.357352   \n",
       "48829  1.730342           Private  0.978500    Assoc-acdm       0.744962   \n",
       "48830 -0.592825           Private -0.153595       HS-grad      -0.417870   \n",
       "48831  0.955953           Private  0.910723       HS-grad      -0.417870   \n",
       "48832  1.730342           Private -0.948722       HS-grad      -0.417870   \n",
       "48833 -0.592825           Private  2.377888       HS-grad      -0.417870   \n",
       "48834 -1.367214           Private  1.531605       HS-grad      -0.417870   \n",
       "48835  0.955953         Local-gov  1.515021       Masters       1.520184   \n",
       "48836 -0.592825           Private  0.527612     Bachelors       1.132573   \n",
       "48837  0.181564           Private  0.244809     Bachelors       1.132573   \n",
       "48838  1.730342           Private  1.250871       HS-grad      -0.417870   \n",
       "48839  0.181564           Private  1.759484     Bachelors       1.132573   \n",
       "48840  0.181564           Private -1.003732     Bachelors       1.132573   \n",
       "48841 -0.592825      Self-emp-inc -0.071019     Bachelors       1.132573   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "34189     Married-civ-spouse       Craft-repair         Husband   \n",
       "34190     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34191          Never-married      Other-service       Own-child   \n",
       "34192     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34193     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34194          Never-married              Sales       Own-child   \n",
       "34195     Married-civ-spouse       Craft-repair  Other-relative   \n",
       "34196               Divorced       Adm-clerical       Unmarried   \n",
       "34197          Never-married     Prof-specialty   Not-in-family   \n",
       "34198              Separated      Other-service       Own-child   \n",
       "34199              Separated       Craft-repair   Not-in-family   \n",
       "34200               Divorced       Adm-clerical       Unmarried   \n",
       "34201     Married-civ-spouse       Craft-repair         Husband   \n",
       "34202          Never-married     Prof-specialty       Own-child   \n",
       "34203  Married-spouse-absent       Adm-clerical  Other-relative   \n",
       "...                      ...                ...             ...   \n",
       "48827              Separated    Priv-house-serv   Not-in-family   \n",
       "48828          Never-married       Adm-clerical       Unmarried   \n",
       "48829               Divorced     Prof-specialty   Not-in-family   \n",
       "48830     Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "48831     Married-civ-spouse       Adm-clerical         Husband   \n",
       "48832     Married-civ-spouse              Sales         Husband   \n",
       "48833     Married-civ-spouse       Craft-repair         Husband   \n",
       "48834          Never-married      Other-service       Own-child   \n",
       "48835               Divorced      Other-service   Not-in-family   \n",
       "48836          Never-married     Prof-specialty       Own-child   \n",
       "48837               Divorced     Prof-specialty   Not-in-family   \n",
       "48838                Widowed     Prof-specialty  Other-relative   \n",
       "48839     Married-civ-spouse     Prof-specialty         Husband   \n",
       "48840               Divorced       Adm-clerical       Own-child   \n",
       "48841     Married-civ-spouse    Exec-managerial         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "34189               White    Male    -0.271118    -0.206016      1.172925   \n",
       "34190               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34191               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34192               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34193               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34194               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34195               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34196               Black  Female    -0.271118    -0.206016     -1.065986   \n",
       "34197               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34198               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34199               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34200               White  Female    -0.271118    -0.206016      0.053470   \n",
       "34201               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34202               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34203               White    Male    -0.271118    -0.206016      1.172925   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "48827               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "48828               Black  Female    -0.271118    -0.206016      0.053470   \n",
       "48829               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48830               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48831               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48832               White    Male    -0.271118    -0.206016      1.172925   \n",
       "48833               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48834               White  Female    -0.271118    -0.206016      0.053470   \n",
       "48835               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48836               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48837               White  Female    -0.271118    -0.206016      0.053470   \n",
       "48838               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "48839               White    Male    -0.271118    -0.206016      1.172925   \n",
       "48840  Asian-Pac-Islander    Male     2.380648    -0.206016      0.053470   \n",
       "48841               White    Male    -0.271118    -0.206016      1.172925   \n",
       "\n",
       "      native-country  class  \n",
       "34189  United-States  <=50K  \n",
       "34190  United-States   >50K  \n",
       "34191  United-States  <=50K  \n",
       "34192  United-States   >50K  \n",
       "34193  United-States  <=50K  \n",
       "34194  United-States  <=50K  \n",
       "34195  United-States  <=50K  \n",
       "34196  United-States  <=50K  \n",
       "34197  United-States  <=50K  \n",
       "34198  United-States  <=50K  \n",
       "34199  United-States  <=50K  \n",
       "34200  United-States  <=50K  \n",
       "34201  United-States  <=50K  \n",
       "34202  United-States  <=50K  \n",
       "34203  United-States  <=50K  \n",
       "...              ...    ...  \n",
       "48827  United-States  <=50K  \n",
       "48828  United-States  <=50K  \n",
       "48829  United-States  <=50K  \n",
       "48830  United-States  <=50K  \n",
       "48831  United-States  <=50K  \n",
       "48832  United-States  <=50K  \n",
       "48833  United-States  <=50K  \n",
       "48834  United-States  <=50K  \n",
       "48835  United-States  <=50K  \n",
       "48836  United-States  <=50K  \n",
       "48837  United-States  <=50K  \n",
       "48838  United-States  <=50K  \n",
       "48839  United-States  <=50K  \n",
       "48840  United-States  <=50K  \n",
       "48841  United-States   >50K  \n",
       "\n",
       "[14653 rows x 15 columns]"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_test_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "unnecessary-answer",
   "metadata": {},
   "source": [
    "#### cat_imputation=\"drop\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "norman-conjunction",
   "metadata": {},
   "outputs": [],
   "source": [
    "cleaned_training_df, cleaned_test_df = clean_ml(training_df, test_df, target=\"class\", \n",
    "                                                cat_imputation=\"drop\", \n",
    "                                                cat_encoding=\"no_encoding\", cat_null_value=['?'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "sound-declaration",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>-1.064247</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>1.054765</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>-1.009237</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>0.246964</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>0.428035</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>1.412302</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Wife</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>0.901345</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>-0.279485</td>\n",
       "      <td>9th</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>0.189970</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>-1.365494</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>5.032415</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>-0.286491</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>0.862254</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>2.292380</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>-0.458800</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>-0.639397</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>0.146086</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>-0.644143</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>-0.151677</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>-0.382480</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>-0.407759</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>-0.153272</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>0.323123</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>-0.070743</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>-0.761452</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>-0.367482</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>5.199568</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>-1.428648</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>-0.817212</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>-0.753877</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>0.311551</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>7.001429</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>-0.720198</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>0.608356</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>1.113276</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 12 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age    fnlwgt     education  education-num         marital-status  \\\n",
       "0      0.181564 -1.064247     Bachelors       1.132573          Never-married   \n",
       "1      0.955953 -1.009237     Bachelors       1.132573     Married-civ-spouse   \n",
       "2      0.181564  0.246964       HS-grad      -0.417870               Divorced   \n",
       "3      0.955953  0.428035          11th      -1.193092     Married-civ-spouse   \n",
       "4     -0.592825  1.412302     Bachelors       1.132573     Married-civ-spouse   \n",
       "5      0.181564  0.901345       Masters       1.520184     Married-civ-spouse   \n",
       "6      0.955953 -0.279485           9th      -1.968313  Married-spouse-absent   \n",
       "7      0.955953  0.189970       HS-grad      -0.417870     Married-civ-spouse   \n",
       "8     -0.592825 -1.365494       Masters       1.520184          Never-married   \n",
       "9      0.181564 -0.286491     Bachelors       1.132573     Married-civ-spouse   \n",
       "10     0.181564  0.862254  Some-college      -0.030259     Married-civ-spouse   \n",
       "11    -0.592825 -0.458800     Bachelors       1.132573     Married-civ-spouse   \n",
       "12    -1.367214 -0.639397     Bachelors       1.132573          Never-married   \n",
       "13    -0.592825  0.146086    Assoc-acdm       0.744962          Never-married   \n",
       "14     0.181564 -0.644143     Assoc-voc       0.357352     Married-civ-spouse   \n",
       "...         ...       ...           ...            ...                    ...   \n",
       "34174  0.181564 -0.151677  Some-college      -0.030259     Married-civ-spouse   \n",
       "34175  0.955953 -0.382480       HS-grad      -0.417870              Separated   \n",
       "34176  1.730342 -0.407759       HS-grad      -0.417870     Married-civ-spouse   \n",
       "34177  1.730342 -0.153272     Bachelors       1.132573               Divorced   \n",
       "34178 -1.367214  0.323123          11th      -1.193092          Never-married   \n",
       "34179  0.955953 -0.070743  Some-college      -0.030259               Divorced   \n",
       "34180 -1.367214 -0.761452  Some-college      -0.030259          Never-married   \n",
       "34181  0.955953 -0.367482  Some-college      -0.030259     Married-civ-spouse   \n",
       "34182  1.730342 -1.428648       HS-grad      -0.417870     Married-civ-spouse   \n",
       "34183  0.955953 -0.817212     Bachelors       1.132573     Married-civ-spouse   \n",
       "34184  1.730342 -0.753877       HS-grad      -0.417870     Married-civ-spouse   \n",
       "34185  0.181564  0.311551       HS-grad      -0.417870          Never-married   \n",
       "34186 -1.367214 -0.720198       HS-grad      -0.417870          Never-married   \n",
       "34187  0.181564  0.608356          11th      -1.193092     Married-civ-spouse   \n",
       "34188 -1.367214  1.113276       HS-grad      -0.417870     Married-civ-spouse   \n",
       "\n",
       "         relationship                race     sex  capitalgain  capitalloss  \\\n",
       "0       Not-in-family               White    Male     1.054765    -0.206016   \n",
       "1             Husband               White    Male    -0.271118    -0.206016   \n",
       "2       Not-in-family               White    Male    -0.271118    -0.206016   \n",
       "3             Husband               Black    Male    -0.271118    -0.206016   \n",
       "4                Wife               Black  Female    -0.271118    -0.206016   \n",
       "5                Wife               White  Female    -0.271118    -0.206016   \n",
       "6       Not-in-family               Black  Female    -0.271118    -0.206016   \n",
       "7             Husband               White    Male    -0.271118    -0.206016   \n",
       "8       Not-in-family               White  Female     5.032415    -0.206016   \n",
       "9             Husband               White    Male     2.380648    -0.206016   \n",
       "10            Husband               Black    Male    -0.271118    -0.206016   \n",
       "11            Husband  Asian-Pac-Islander    Male    -0.271118    -0.206016   \n",
       "12          Own-child               White  Female    -0.271118    -0.206016   \n",
       "13      Not-in-family               Black    Male    -0.271118    -0.206016   \n",
       "14            Husband  Asian-Pac-Islander    Male    -0.271118    -0.206016   \n",
       "...               ...                 ...     ...          ...          ...   \n",
       "34174         Husband               White    Male    -0.271118    -0.206016   \n",
       "34175   Not-in-family               White    Male    -0.271118    -0.206016   \n",
       "34176         Husband               Black    Male    -0.271118    -0.206016   \n",
       "34177   Not-in-family               White  Female    -0.271118    -0.206016   \n",
       "34178       Own-child               White    Male    -0.271118    -0.206016   \n",
       "34179       Unmarried               White  Female    -0.271118    -0.206016   \n",
       "34180  Other-relative  Asian-Pac-Islander    Male    -0.271118    -0.206016   \n",
       "34181         Husband               White    Male    -0.271118     5.199568   \n",
       "34182         Husband               White    Male    -0.271118    -0.206016   \n",
       "34183         Husband               White    Male    -0.271118    -0.206016   \n",
       "34184         Husband               Black    Male    -0.271118    -0.206016   \n",
       "34185   Not-in-family               White    Male    -0.271118     7.001429   \n",
       "34186       Own-child               White  Female    -0.271118    -0.206016   \n",
       "34187            Wife               White  Female    -0.271118    -0.206016   \n",
       "34188         Husband               White    Male    -0.271118    -0.206016   \n",
       "\n",
       "       hoursperweek  class  \n",
       "0          0.053470  <=50K  \n",
       "1         -2.185441  <=50K  \n",
       "2          0.053470  <=50K  \n",
       "3          0.053470  <=50K  \n",
       "4          0.053470  <=50K  \n",
       "5          0.053470  <=50K  \n",
       "6         -2.185441  <=50K  \n",
       "7          0.053470   >50K  \n",
       "8          1.172925   >50K  \n",
       "9          0.053470   >50K  \n",
       "10         2.292380   >50K  \n",
       "11         0.053470   >50K  \n",
       "12        -1.065986  <=50K  \n",
       "13         1.172925  <=50K  \n",
       "14         0.053470   >50K  \n",
       "...             ...    ...  \n",
       "34174      0.053470   >50K  \n",
       "34175      0.053470  <=50K  \n",
       "34176      0.053470   >50K  \n",
       "34177     -1.065986  <=50K  \n",
       "34178     -2.185441  <=50K  \n",
       "34179     -1.065986  <=50K  \n",
       "34180     -2.185441  <=50K  \n",
       "34181      0.053470   >50K  \n",
       "34182     -1.065986  <=50K  \n",
       "34183      1.172925   >50K  \n",
       "34184      0.053470  <=50K  \n",
       "34185      0.053470  <=50K  \n",
       "34186      0.053470  <=50K  \n",
       "34187     -2.185441  <=50K  \n",
       "34188      0.053470  <=50K  \n",
       "\n",
       "[34189 rows x 12 columns]"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "future-ferry",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>0.704744</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>-1.275191</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>-0.147766</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>0.655990</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>0.818617</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>-0.335985</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>0.197621</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>1.407546</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>0.149067</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>-0.020718</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>-0.342117</td>\n",
       "      <td>9th</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>-0.376300</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>1.987078</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>0.042399</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>-1.382011</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>0.332483</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>0.549787</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>0.978500</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>-0.153595</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>0.910723</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>-0.948722</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>2.377888</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>1.531605</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>1.515021</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>0.527612</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>0.244809</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>1.250871</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Widowed</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>1.759484</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>-1.003732</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>-0.071019</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 12 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age    fnlwgt     education  education-num         marital-status  \\\n",
       "34189  0.181564  0.704744  Some-college      -0.030259     Married-civ-spouse   \n",
       "34190  0.181564 -1.275191     Bachelors       1.132573     Married-civ-spouse   \n",
       "34191 -1.367214 -0.147766     Assoc-voc       0.357352          Never-married   \n",
       "34192  0.955953  0.655990  Some-college      -0.030259     Married-civ-spouse   \n",
       "34193  0.955953  0.818617       HS-grad      -0.417870     Married-civ-spouse   \n",
       "34194 -1.367214 -0.335985  Some-college      -0.030259          Never-married   \n",
       "34195 -0.592825  0.197621  Some-college      -0.030259     Married-civ-spouse   \n",
       "34196 -0.592825  1.407546  Some-college      -0.030259               Divorced   \n",
       "34197 -0.592825  0.149067     Bachelors       1.132573          Never-married   \n",
       "34198 -1.367214 -0.020718  Some-college      -0.030259              Separated   \n",
       "34199 -0.592825 -0.342117           9th      -1.968313              Separated   \n",
       "34200 -0.592825 -0.376300  Some-college      -0.030259               Divorced   \n",
       "34201  0.181564  1.987078  Some-college      -0.030259     Married-civ-spouse   \n",
       "34202 -1.367214  0.042399  Some-college      -0.030259          Never-married   \n",
       "34203  0.181564 -1.382011    Assoc-acdm       0.744962  Married-spouse-absent   \n",
       "...         ...       ...           ...            ...                    ...   \n",
       "48827  0.955953  0.332483       HS-grad      -0.417870              Separated   \n",
       "48828  0.181564  0.549787     Assoc-voc       0.357352          Never-married   \n",
       "48829  1.730342  0.978500    Assoc-acdm       0.744962               Divorced   \n",
       "48830 -0.592825 -0.153595       HS-grad      -0.417870     Married-civ-spouse   \n",
       "48831  0.955953  0.910723       HS-grad      -0.417870     Married-civ-spouse   \n",
       "48832  1.730342 -0.948722       HS-grad      -0.417870     Married-civ-spouse   \n",
       "48833 -0.592825  2.377888       HS-grad      -0.417870     Married-civ-spouse   \n",
       "48834 -1.367214  1.531605       HS-grad      -0.417870          Never-married   \n",
       "48835  0.955953  1.515021       Masters       1.520184               Divorced   \n",
       "48836 -0.592825  0.527612     Bachelors       1.132573          Never-married   \n",
       "48837  0.181564  0.244809     Bachelors       1.132573               Divorced   \n",
       "48838  1.730342  1.250871       HS-grad      -0.417870                Widowed   \n",
       "48839  0.181564  1.759484     Bachelors       1.132573     Married-civ-spouse   \n",
       "48840  0.181564 -1.003732     Bachelors       1.132573               Divorced   \n",
       "48841 -0.592825 -0.071019     Bachelors       1.132573     Married-civ-spouse   \n",
       "\n",
       "         relationship                race     sex  capitalgain  capitalloss  \\\n",
       "34189         Husband               White    Male    -0.271118    -0.206016   \n",
       "34190         Husband               White    Male    -0.271118    -0.206016   \n",
       "34191       Own-child               White  Female    -0.271118    -0.206016   \n",
       "34192         Husband               White    Male    -0.271118    -0.206016   \n",
       "34193         Husband               White    Male    -0.271118    -0.206016   \n",
       "34194       Own-child               White  Female    -0.271118    -0.206016   \n",
       "34195  Other-relative               White    Male    -0.271118    -0.206016   \n",
       "34196       Unmarried               Black  Female    -0.271118    -0.206016   \n",
       "34197   Not-in-family               White  Female    -0.271118    -0.206016   \n",
       "34198       Own-child               White    Male    -0.271118    -0.206016   \n",
       "34199   Not-in-family               White    Male    -0.271118    -0.206016   \n",
       "34200       Unmarried               White  Female    -0.271118    -0.206016   \n",
       "34201         Husband               White    Male    -0.271118    -0.206016   \n",
       "34202       Own-child               White  Female    -0.271118    -0.206016   \n",
       "34203  Other-relative               White    Male    -0.271118    -0.206016   \n",
       "...               ...                 ...     ...          ...          ...   \n",
       "48827   Not-in-family               White  Female    -0.271118    -0.206016   \n",
       "48828       Unmarried               Black  Female    -0.271118    -0.206016   \n",
       "48829   Not-in-family               White    Male    -0.271118    -0.206016   \n",
       "48830         Husband               White    Male    -0.271118    -0.206016   \n",
       "48831         Husband               White    Male    -0.271118    -0.206016   \n",
       "48832         Husband               White    Male    -0.271118    -0.206016   \n",
       "48833         Husband               White    Male    -0.271118    -0.206016   \n",
       "48834       Own-child               White  Female    -0.271118    -0.206016   \n",
       "48835   Not-in-family               White    Male    -0.271118    -0.206016   \n",
       "48836       Own-child               White    Male    -0.271118    -0.206016   \n",
       "48837   Not-in-family               White  Female    -0.271118    -0.206016   \n",
       "48838  Other-relative               Black    Male    -0.271118    -0.206016   \n",
       "48839         Husband               White    Male    -0.271118    -0.206016   \n",
       "48840       Own-child  Asian-Pac-Islander    Male     2.380648    -0.206016   \n",
       "48841         Husband               White    Male    -0.271118    -0.206016   \n",
       "\n",
       "       hoursperweek  class  \n",
       "34189      1.172925  <=50K  \n",
       "34190      0.053470   >50K  \n",
       "34191     -2.185441  <=50K  \n",
       "34192      0.053470   >50K  \n",
       "34193      0.053470  <=50K  \n",
       "34194     -1.065986  <=50K  \n",
       "34195      0.053470  <=50K  \n",
       "34196     -1.065986  <=50K  \n",
       "34197     -2.185441  <=50K  \n",
       "34198      0.053470  <=50K  \n",
       "34199      0.053470  <=50K  \n",
       "34200      0.053470  <=50K  \n",
       "34201      0.053470  <=50K  \n",
       "34202     -1.065986  <=50K  \n",
       "34203      1.172925  <=50K  \n",
       "...             ...    ...  \n",
       "48827     -1.065986  <=50K  \n",
       "48828      0.053470  <=50K  \n",
       "48829      0.053470  <=50K  \n",
       "48830      0.053470  <=50K  \n",
       "48831      0.053470  <=50K  \n",
       "48832      1.172925  <=50K  \n",
       "48833      0.053470  <=50K  \n",
       "48834      0.053470  <=50K  \n",
       "48835      0.053470  <=50K  \n",
       "48836      0.053470  <=50K  \n",
       "48837      0.053470  <=50K  \n",
       "48838      0.053470  <=50K  \n",
       "48839      1.172925  <=50K  \n",
       "48840      0.053470  <=50K  \n",
       "48841      1.172925   >50K  \n",
       "\n",
       "[14653 rows x 12 columns]"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_test_df"
   ]
  },
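  {
   "cell_type": "markdown",
   "id": "ethical-faculty",
   "metadata": {},
   "source": [
    "## 3. `fill_val` parameter\n",
    "\n",
    "By default, missing values in categorical columns are filled with the string \"missing value\". Users can replace this default with any string they like through the `fill_val` parameter, such as `\"missing\"`, `\"NaN\"`, `\"I'm a cat.\"`, or `\"Fyodor Dostoyevsky\"`. In the example below, `'?'` is treated as the categorical null value, and the filled cells show up as `\"AHAHAHAHAHA!!!\"` in the output."
   ]
  },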
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "several-scanner",
   "metadata": {},
   "outputs": [],
   "source": [
    "cleaned_training_df, cleaned_test_df = clean_ml(training_df, test_df, target=\"class\", \n",
    "                                                cat_null_value=['?'], cat_encoding=\"no_encoding\",\n",
    "                                                fill_val=\"AHAHAHAHAHA!!!\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "weird-effect",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-1.064247</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>1.054765</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>-1.009237</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.246964</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.428035</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.412302</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Wife</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>Cuba</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.901345</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.279485</td>\n",
       "      <td>9th</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>Jamaica</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.189970</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.365494</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>5.032415</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.286491</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.862254</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>2.292380</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-0.458800</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>India</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.639397</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.146086</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.644143</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>AHAHAHAHAHA!!!</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.151677</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.382480</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.407759</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>AHAHAHAHAHA!!!</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.153272</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.323123</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.070743</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Protective-serv</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.761452</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>India</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>-0.367482</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>5.199568</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>-1.428648</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>-0.817212</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.753877</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.311551</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>7.001429</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>El-Salvador</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>AHAHAHAHAHA!!!</td>\n",
       "      <td>-0.720198</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>AHAHAHAHAHA!!!</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>AHAHAHAHAHA!!!</td>\n",
       "      <td>0.608356</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>AHAHAHAHAHA!!!</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.113276</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Machine-op-inspct</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age         workclass    fnlwgt     education  education-num  \\\n",
       "0      0.181564         State-gov -1.064247     Bachelors       1.132573   \n",
       "1      0.955953  Self-emp-not-inc -1.009237     Bachelors       1.132573   \n",
       "2      0.181564           Private  0.246964       HS-grad      -0.417870   \n",
       "3      0.955953           Private  0.428035          11th      -1.193092   \n",
       "4     -0.592825           Private  1.412302     Bachelors       1.132573   \n",
       "5      0.181564           Private  0.901345       Masters       1.520184   \n",
       "6      0.955953           Private -0.279485           9th      -1.968313   \n",
       "7      0.955953  Self-emp-not-inc  0.189970       HS-grad      -0.417870   \n",
       "8     -0.592825           Private -1.365494       Masters       1.520184   \n",
       "9      0.181564           Private -0.286491     Bachelors       1.132573   \n",
       "10     0.181564           Private  0.862254  Some-college      -0.030259   \n",
       "11    -0.592825         State-gov -0.458800     Bachelors       1.132573   \n",
       "12    -1.367214           Private -0.639397     Bachelors       1.132573   \n",
       "13    -0.592825           Private  0.146086    Assoc-acdm       0.744962   \n",
       "14     0.181564           Private -0.644143     Assoc-voc       0.357352   \n",
       "...         ...               ...       ...           ...            ...   \n",
       "34174  0.181564           Private -0.151677  Some-college      -0.030259   \n",
       "34175  0.955953           Private -0.382480       HS-grad      -0.417870   \n",
       "34176  1.730342           Private -0.407759       HS-grad      -0.417870   \n",
       "34177  1.730342           Private -0.153272     Bachelors       1.132573   \n",
       "34178 -1.367214           Private  0.323123          11th      -1.193092   \n",
       "34179  0.955953           Private -0.070743  Some-college      -0.030259   \n",
       "34180 -1.367214           Private -0.761452  Some-college      -0.030259   \n",
       "34181  0.955953      Self-emp-inc -0.367482  Some-college      -0.030259   \n",
       "34182  1.730342  Self-emp-not-inc -1.428648       HS-grad      -0.417870   \n",
       "34183  0.955953         Local-gov -0.817212     Bachelors       1.132573   \n",
       "34184  1.730342           Private -0.753877       HS-grad      -0.417870   \n",
       "34185  0.181564           Private  0.311551       HS-grad      -0.417870   \n",
       "34186 -1.367214    AHAHAHAHAHA!!! -0.720198       HS-grad      -0.417870   \n",
       "34187  0.181564    AHAHAHAHAHA!!!  0.608356          11th      -1.193092   \n",
       "34188 -1.367214           Private  1.113276       HS-grad      -0.417870   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "0              Never-married       Adm-clerical   Not-in-family   \n",
       "1         Married-civ-spouse    Exec-managerial         Husband   \n",
       "2                   Divorced  Handlers-cleaners   Not-in-family   \n",
       "3         Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "4         Married-civ-spouse     Prof-specialty            Wife   \n",
       "5         Married-civ-spouse    Exec-managerial            Wife   \n",
       "6      Married-spouse-absent      Other-service   Not-in-family   \n",
       "7         Married-civ-spouse    Exec-managerial         Husband   \n",
       "8              Never-married     Prof-specialty   Not-in-family   \n",
       "9         Married-civ-spouse    Exec-managerial         Husband   \n",
       "10        Married-civ-spouse    Exec-managerial         Husband   \n",
       "11        Married-civ-spouse     Prof-specialty         Husband   \n",
       "12             Never-married       Adm-clerical       Own-child   \n",
       "13             Never-married              Sales   Not-in-family   \n",
       "14        Married-civ-spouse       Craft-repair         Husband   \n",
       "...                      ...                ...             ...   \n",
       "34174     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34175              Separated  Handlers-cleaners   Not-in-family   \n",
       "34176     Married-civ-spouse      Other-service         Husband   \n",
       "34177               Divorced     Prof-specialty   Not-in-family   \n",
       "34178          Never-married      Other-service       Own-child   \n",
       "34179               Divorced    Protective-serv       Unmarried   \n",
       "34180          Never-married              Sales  Other-relative   \n",
       "34181     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34182     Married-civ-spouse       Craft-repair         Husband   \n",
       "34183     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34184     Married-civ-spouse      Other-service         Husband   \n",
       "34185          Never-married              Sales   Not-in-family   \n",
       "34186          Never-married     AHAHAHAHAHA!!!       Own-child   \n",
       "34187     Married-civ-spouse     AHAHAHAHAHA!!!            Wife   \n",
       "34188     Married-civ-spouse  Machine-op-inspct         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "0                   White    Male     1.054765    -0.206016      0.053470   \n",
       "1                   White    Male    -0.271118    -0.206016     -2.185441   \n",
       "2                   White    Male    -0.271118    -0.206016      0.053470   \n",
       "3                   Black    Male    -0.271118    -0.206016      0.053470   \n",
       "4                   Black  Female    -0.271118    -0.206016      0.053470   \n",
       "5                   White  Female    -0.271118    -0.206016      0.053470   \n",
       "6                   Black  Female    -0.271118    -0.206016     -2.185441   \n",
       "7                   White    Male    -0.271118    -0.206016      0.053470   \n",
       "8                   White  Female     5.032415    -0.206016      1.172925   \n",
       "9                   White    Male     2.380648    -0.206016      0.053470   \n",
       "10                  Black    Male    -0.271118    -0.206016      2.292380   \n",
       "11     Asian-Pac-Islander    Male    -0.271118    -0.206016      0.053470   \n",
       "12                  White  Female    -0.271118    -0.206016     -1.065986   \n",
       "13                  Black    Male    -0.271118    -0.206016      1.172925   \n",
       "14     Asian-Pac-Islander    Male    -0.271118    -0.206016      0.053470   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "34174               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34175               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34176               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "34177               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34178               White    Male    -0.271118    -0.206016     -2.185441   \n",
       "34179               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34180  Asian-Pac-Islander    Male    -0.271118    -0.206016     -2.185441   \n",
       "34181               White    Male    -0.271118     5.199568      0.053470   \n",
       "34182               White    Male    -0.271118    -0.206016     -1.065986   \n",
       "34183               White    Male    -0.271118    -0.206016      1.172925   \n",
       "34184               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "34185               White    Male    -0.271118     7.001429      0.053470   \n",
       "34186               White  Female    -0.271118    -0.206016      0.053470   \n",
       "34187               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34188               White    Male    -0.271118    -0.206016      0.053470   \n",
       "\n",
       "       native-country  class  \n",
       "0       United-States  <=50K  \n",
       "1       United-States  <=50K  \n",
       "2       United-States  <=50K  \n",
       "3       United-States  <=50K  \n",
       "4                Cuba  <=50K  \n",
       "5       United-States  <=50K  \n",
       "6             Jamaica  <=50K  \n",
       "7       United-States   >50K  \n",
       "8       United-States   >50K  \n",
       "9       United-States   >50K  \n",
       "10      United-States   >50K  \n",
       "11              India   >50K  \n",
       "12      United-States  <=50K  \n",
       "13      United-States  <=50K  \n",
       "14     AHAHAHAHAHA!!!   >50K  \n",
       "...               ...    ...  \n",
       "34174   United-States   >50K  \n",
       "34175   United-States  <=50K  \n",
       "34176  AHAHAHAHAHA!!!   >50K  \n",
       "34177   United-States  <=50K  \n",
       "34178   United-States  <=50K  \n",
       "34179   United-States  <=50K  \n",
       "34180           India  <=50K  \n",
       "34181   United-States   >50K  \n",
       "34182   United-States  <=50K  \n",
       "34183   United-States   >50K  \n",
       "34184   United-States  <=50K  \n",
       "34185     El-Salvador  <=50K  \n",
       "34186   United-States  <=50K  \n",
       "34187   United-States  <=50K  \n",
       "34188   United-States  <=50K  \n",
       "\n",
       "[34189 rows x 15 columns]"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "perfect-shell",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.704744</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-1.275191</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.147766</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.655990</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.818617</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.335985</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.197621</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.407546</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.149067</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.020718</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.342117</td>\n",
       "      <td>9th</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>-0.376300</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.987078</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>AHAHAHAHAHA!!!</td>\n",
       "      <td>0.042399</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>AHAHAHAHAHA!!!</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.382011</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.332483</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Priv-house-serv</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.549787</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.978500</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.153595</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.910723</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.948722</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>2.377888</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.531605</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>1.515021</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.527612</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.244809</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>AHAHAHAHAHA!!!</td>\n",
       "      <td>1.250871</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Widowed</td>\n",
       "      <td>AHAHAHAHAHA!!!</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.759484</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.003732</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>-0.071019</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age         workclass    fnlwgt     education  education-num  \\\n",
       "34189  0.181564  Self-emp-not-inc  0.704744  Some-college      -0.030259   \n",
       "34190  0.181564         State-gov -1.275191     Bachelors       1.132573   \n",
       "34191 -1.367214           Private -0.147766     Assoc-voc       0.357352   \n",
       "34192  0.955953         State-gov  0.655990  Some-college      -0.030259   \n",
       "34193  0.955953           Private  0.818617       HS-grad      -0.417870   \n",
       "34194 -1.367214           Private -0.335985  Some-college      -0.030259   \n",
       "34195 -0.592825         Local-gov  0.197621  Some-college      -0.030259   \n",
       "34196 -0.592825           Private  1.407546  Some-college      -0.030259   \n",
       "34197 -0.592825         State-gov  0.149067     Bachelors       1.132573   \n",
       "34198 -1.367214           Private -0.020718  Some-college      -0.030259   \n",
       "34199 -0.592825           Private -0.342117           9th      -1.968313   \n",
       "34200 -0.592825         Local-gov -0.376300  Some-college      -0.030259   \n",
       "34201  0.181564           Private  1.987078  Some-college      -0.030259   \n",
       "34202 -1.367214    AHAHAHAHAHA!!!  0.042399  Some-college      -0.030259   \n",
       "34203  0.181564           Private -1.382011    Assoc-acdm       0.744962   \n",
       "...         ...               ...       ...           ...            ...   \n",
       "48827  0.955953           Private  0.332483       HS-grad      -0.417870   \n",
       "48828  0.181564           Private  0.549787     Assoc-voc       0.357352   \n",
       "48829  1.730342           Private  0.978500    Assoc-acdm       0.744962   \n",
       "48830 -0.592825           Private -0.153595       HS-grad      -0.417870   \n",
       "48831  0.955953           Private  0.910723       HS-grad      -0.417870   \n",
       "48832  1.730342           Private -0.948722       HS-grad      -0.417870   \n",
       "48833 -0.592825           Private  2.377888       HS-grad      -0.417870   \n",
       "48834 -1.367214           Private  1.531605       HS-grad      -0.417870   \n",
       "48835  0.955953         Local-gov  1.515021       Masters       1.520184   \n",
       "48836 -0.592825           Private  0.527612     Bachelors       1.132573   \n",
       "48837  0.181564           Private  0.244809     Bachelors       1.132573   \n",
       "48838  1.730342    AHAHAHAHAHA!!!  1.250871       HS-grad      -0.417870   \n",
       "48839  0.181564           Private  1.759484     Bachelors       1.132573   \n",
       "48840  0.181564           Private -1.003732     Bachelors       1.132573   \n",
       "48841 -0.592825      Self-emp-inc -0.071019     Bachelors       1.132573   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "34189     Married-civ-spouse       Craft-repair         Husband   \n",
       "34190     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34191          Never-married      Other-service       Own-child   \n",
       "34192     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34193     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34194          Never-married              Sales       Own-child   \n",
       "34195     Married-civ-spouse       Craft-repair  Other-relative   \n",
       "34196               Divorced       Adm-clerical       Unmarried   \n",
       "34197          Never-married     Prof-specialty   Not-in-family   \n",
       "34198              Separated      Other-service       Own-child   \n",
       "34199              Separated       Craft-repair   Not-in-family   \n",
       "34200               Divorced       Adm-clerical       Unmarried   \n",
       "34201     Married-civ-spouse       Craft-repair         Husband   \n",
       "34202          Never-married     AHAHAHAHAHA!!!       Own-child   \n",
       "34203  Married-spouse-absent       Adm-clerical  Other-relative   \n",
       "...                      ...                ...             ...   \n",
       "48827              Separated    Priv-house-serv   Not-in-family   \n",
       "48828          Never-married       Adm-clerical       Unmarried   \n",
       "48829               Divorced     Prof-specialty   Not-in-family   \n",
       "48830     Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "48831     Married-civ-spouse       Adm-clerical         Husband   \n",
       "48832     Married-civ-spouse              Sales         Husband   \n",
       "48833     Married-civ-spouse       Craft-repair         Husband   \n",
       "48834          Never-married      Other-service       Own-child   \n",
       "48835               Divorced      Other-service   Not-in-family   \n",
       "48836          Never-married     Prof-specialty       Own-child   \n",
       "48837               Divorced     Prof-specialty   Not-in-family   \n",
       "48838                Widowed     AHAHAHAHAHA!!!  Other-relative   \n",
       "48839     Married-civ-spouse     Prof-specialty         Husband   \n",
       "48840               Divorced       Adm-clerical       Own-child   \n",
       "48841     Married-civ-spouse    Exec-managerial         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "34189               White    Male    -0.271118    -0.206016      1.172925   \n",
       "34190               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34191               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34192               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34193               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34194               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34195               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34196               Black  Female    -0.271118    -0.206016     -1.065986   \n",
       "34197               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34198               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34199               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34200               White  Female    -0.271118    -0.206016      0.053470   \n",
       "34201               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34202               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34203               White    Male    -0.271118    -0.206016      1.172925   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "48827               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "48828               Black  Female    -0.271118    -0.206016      0.053470   \n",
       "48829               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48830               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48831               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48832               White    Male    -0.271118    -0.206016      1.172925   \n",
       "48833               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48834               White  Female    -0.271118    -0.206016      0.053470   \n",
       "48835               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48836               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48837               White  Female    -0.271118    -0.206016      0.053470   \n",
       "48838               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "48839               White    Male    -0.271118    -0.206016      1.172925   \n",
       "48840  Asian-Pac-Islander    Male     2.380648    -0.206016      0.053470   \n",
       "48841               White    Male    -0.271118    -0.206016      1.172925   \n",
       "\n",
       "      native-country  class  \n",
       "34189  United-States  <=50K  \n",
       "34190  United-States   >50K  \n",
       "34191  United-States  <=50K  \n",
       "34192  United-States   >50K  \n",
       "34193  United-States  <=50K  \n",
       "34194  United-States  <=50K  \n",
       "34195  United-States  <=50K  \n",
       "34196  United-States  <=50K  \n",
       "34197  United-States  <=50K  \n",
       "34198  United-States  <=50K  \n",
       "34199  United-States  <=50K  \n",
       "34200  United-States  <=50K  \n",
       "34201  United-States  <=50K  \n",
       "34202  United-States  <=50K  \n",
       "34203  United-States  <=50K  \n",
       "...              ...    ...  \n",
       "48827  United-States  <=50K  \n",
       "48828  United-States  <=50K  \n",
       "48829  United-States  <=50K  \n",
       "48830  United-States  <=50K  \n",
       "48831  United-States  <=50K  \n",
       "48832  United-States  <=50K  \n",
       "48833  United-States  <=50K  \n",
       "48834  United-States  <=50K  \n",
       "48835  United-States  <=50K  \n",
       "48836  United-States  <=50K  \n",
       "48837  United-States  <=50K  \n",
       "48838  United-States  <=50K  \n",
       "48839  United-States  <=50K  \n",
       "48840  United-States  <=50K  \n",
       "48841  United-States   >50K  \n",
       "\n",
       "[14653 rows x 15 columns]"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_test_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "modern-retro",
   "metadata": {},
   "source": [
    "## 4. `num_imputation` and  `num_null_value` parameter\n",
    "There are three choices for `num_imputation` parameter:\n",
    "* `mean`: filling the missing value with mean value of this column. \n",
    "* `meduab`: filling the missing value with median value of this column. \n",
    "* `most_frequent`:  filling the missing value with most frequent value of this column.\n",
    "* `drop`: drop this column if there are missing values.\n",
    "\n",
    "The default null values are same to the null values metioned in `cat_imputation` parameter.\n",
    "\n",
    "The imputing process is quite similar with the `cat_imputation` parameter section. Thus, we don't show redundant examples here.\n",
    "\n",
    "`num_null_value` parameter is a list including user-specified null values. The element in this list can be any type. For example:\n",
    "* ['?']\n",
    "* ['abc', np.nan, '?', 1265]\n",
    "\n",
    "The usage of `num_null_value` parameter is same to `cat_null_value` parameter. Thus we don't show redundant examples here."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "harmful-norway",
   "metadata": {},
   "source": [
    "## 5. `cat_encoding` parameter\n",
    "\n",
    "There are three choices for `cat_encoding` parameter:\n",
    "* `no_encoding`: don't do any encoding for categorical columns.\n",
    "* `one_hot`: do one_hot encoding for categorical columns.\n",
    "\n",
    "The default value is `one_hot`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "supposed-recognition",
   "metadata": {},
   "source": [
    "#### cat_encoding = \"no_encoding\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "distributed-funds",
   "metadata": {},
   "outputs": [],
   "source": [
    "cleaned_training_df, cleaned_test_df = clean_ml(training_df, test_df, target=\"class\", cat_encoding=\"no_encoding\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "directed-tumor",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-1.064247</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>1.054765</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>-1.009237</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.246964</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.428035</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.412302</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Wife</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>Cuba</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.901345</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.279485</td>\n",
       "      <td>9th</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>Jamaica</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.189970</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.365494</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>5.032415</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.286491</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.862254</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>2.292380</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-0.458800</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>India</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.639397</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.146086</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.644143</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>?</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.151677</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.382480</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.407759</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>?</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.153272</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.323123</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.070743</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Protective-serv</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.761452</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>India</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>-0.367482</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>5.199568</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>-1.428648</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>-0.817212</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.753877</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.311551</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>7.001429</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>El-Salvador</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>?</td>\n",
       "      <td>-0.720198</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>?</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>?</td>\n",
       "      <td>0.608356</td>\n",
       "      <td>11th</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>?</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.113276</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Machine-op-inspct</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age         workclass    fnlwgt     education  education-num  \\\n",
       "0      0.181564         State-gov -1.064247     Bachelors       1.132573   \n",
       "1      0.955953  Self-emp-not-inc -1.009237     Bachelors       1.132573   \n",
       "2      0.181564           Private  0.246964       HS-grad      -0.417870   \n",
       "3      0.955953           Private  0.428035          11th      -1.193092   \n",
       "4     -0.592825           Private  1.412302     Bachelors       1.132573   \n",
       "5      0.181564           Private  0.901345       Masters       1.520184   \n",
       "6      0.955953           Private -0.279485           9th      -1.968313   \n",
       "7      0.955953  Self-emp-not-inc  0.189970       HS-grad      -0.417870   \n",
       "8     -0.592825           Private -1.365494       Masters       1.520184   \n",
       "9      0.181564           Private -0.286491     Bachelors       1.132573   \n",
       "10     0.181564           Private  0.862254  Some-college      -0.030259   \n",
       "11    -0.592825         State-gov -0.458800     Bachelors       1.132573   \n",
       "12    -1.367214           Private -0.639397     Bachelors       1.132573   \n",
       "13    -0.592825           Private  0.146086    Assoc-acdm       0.744962   \n",
       "14     0.181564           Private -0.644143     Assoc-voc       0.357352   \n",
       "...         ...               ...       ...           ...            ...   \n",
       "34174  0.181564           Private -0.151677  Some-college      -0.030259   \n",
       "34175  0.955953           Private -0.382480       HS-grad      -0.417870   \n",
       "34176  1.730342           Private -0.407759       HS-grad      -0.417870   \n",
       "34177  1.730342           Private -0.153272     Bachelors       1.132573   \n",
       "34178 -1.367214           Private  0.323123          11th      -1.193092   \n",
       "34179  0.955953           Private -0.070743  Some-college      -0.030259   \n",
       "34180 -1.367214           Private -0.761452  Some-college      -0.030259   \n",
       "34181  0.955953      Self-emp-inc -0.367482  Some-college      -0.030259   \n",
       "34182  1.730342  Self-emp-not-inc -1.428648       HS-grad      -0.417870   \n",
       "34183  0.955953         Local-gov -0.817212     Bachelors       1.132573   \n",
       "34184  1.730342           Private -0.753877       HS-grad      -0.417870   \n",
       "34185  0.181564           Private  0.311551       HS-grad      -0.417870   \n",
       "34186 -1.367214                 ? -0.720198       HS-grad      -0.417870   \n",
       "34187  0.181564                 ?  0.608356          11th      -1.193092   \n",
       "34188 -1.367214           Private  1.113276       HS-grad      -0.417870   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "0              Never-married       Adm-clerical   Not-in-family   \n",
       "1         Married-civ-spouse    Exec-managerial         Husband   \n",
       "2                   Divorced  Handlers-cleaners   Not-in-family   \n",
       "3         Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "4         Married-civ-spouse     Prof-specialty            Wife   \n",
       "5         Married-civ-spouse    Exec-managerial            Wife   \n",
       "6      Married-spouse-absent      Other-service   Not-in-family   \n",
       "7         Married-civ-spouse    Exec-managerial         Husband   \n",
       "8              Never-married     Prof-specialty   Not-in-family   \n",
       "9         Married-civ-spouse    Exec-managerial         Husband   \n",
       "10        Married-civ-spouse    Exec-managerial         Husband   \n",
       "11        Married-civ-spouse     Prof-specialty         Husband   \n",
       "12             Never-married       Adm-clerical       Own-child   \n",
       "13             Never-married              Sales   Not-in-family   \n",
       "14        Married-civ-spouse       Craft-repair         Husband   \n",
       "...                      ...                ...             ...   \n",
       "34174     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34175              Separated  Handlers-cleaners   Not-in-family   \n",
       "34176     Married-civ-spouse      Other-service         Husband   \n",
       "34177               Divorced     Prof-specialty   Not-in-family   \n",
       "34178          Never-married      Other-service       Own-child   \n",
       "34179               Divorced    Protective-serv       Unmarried   \n",
       "34180          Never-married              Sales  Other-relative   \n",
       "34181     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34182     Married-civ-spouse       Craft-repair         Husband   \n",
       "34183     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34184     Married-civ-spouse      Other-service         Husband   \n",
       "34185          Never-married              Sales   Not-in-family   \n",
       "34186          Never-married                  ?       Own-child   \n",
       "34187     Married-civ-spouse                  ?            Wife   \n",
       "34188     Married-civ-spouse  Machine-op-inspct         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "0                   White    Male     1.054765    -0.206016      0.053470   \n",
       "1                   White    Male    -0.271118    -0.206016     -2.185441   \n",
       "2                   White    Male    -0.271118    -0.206016      0.053470   \n",
       "3                   Black    Male    -0.271118    -0.206016      0.053470   \n",
       "4                   Black  Female    -0.271118    -0.206016      0.053470   \n",
       "5                   White  Female    -0.271118    -0.206016      0.053470   \n",
       "6                   Black  Female    -0.271118    -0.206016     -2.185441   \n",
       "7                   White    Male    -0.271118    -0.206016      0.053470   \n",
       "8                   White  Female     5.032415    -0.206016      1.172925   \n",
       "9                   White    Male     2.380648    -0.206016      0.053470   \n",
       "10                  Black    Male    -0.271118    -0.206016      2.292380   \n",
       "11     Asian-Pac-Islander    Male    -0.271118    -0.206016      0.053470   \n",
       "12                  White  Female    -0.271118    -0.206016     -1.065986   \n",
       "13                  Black    Male    -0.271118    -0.206016      1.172925   \n",
       "14     Asian-Pac-Islander    Male    -0.271118    -0.206016      0.053470   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "34174               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34175               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34176               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "34177               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34178               White    Male    -0.271118    -0.206016     -2.185441   \n",
       "34179               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34180  Asian-Pac-Islander    Male    -0.271118    -0.206016     -2.185441   \n",
       "34181               White    Male    -0.271118     5.199568      0.053470   \n",
       "34182               White    Male    -0.271118    -0.206016     -1.065986   \n",
       "34183               White    Male    -0.271118    -0.206016      1.172925   \n",
       "34184               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "34185               White    Male    -0.271118     7.001429      0.053470   \n",
       "34186               White  Female    -0.271118    -0.206016      0.053470   \n",
       "34187               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34188               White    Male    -0.271118    -0.206016      0.053470   \n",
       "\n",
       "      native-country  class  \n",
       "0      United-States  <=50K  \n",
       "1      United-States  <=50K  \n",
       "2      United-States  <=50K  \n",
       "3      United-States  <=50K  \n",
       "4               Cuba  <=50K  \n",
       "5      United-States  <=50K  \n",
       "6            Jamaica  <=50K  \n",
       "7      United-States   >50K  \n",
       "8      United-States   >50K  \n",
       "9      United-States   >50K  \n",
       "10     United-States   >50K  \n",
       "11             India   >50K  \n",
       "12     United-States  <=50K  \n",
       "13     United-States  <=50K  \n",
       "14                 ?   >50K  \n",
       "...              ...    ...  \n",
       "34174  United-States   >50K  \n",
       "34175  United-States  <=50K  \n",
       "34176              ?   >50K  \n",
       "34177  United-States  <=50K  \n",
       "34178  United-States  <=50K  \n",
       "34179  United-States  <=50K  \n",
       "34180          India  <=50K  \n",
       "34181  United-States   >50K  \n",
       "34182  United-States  <=50K  \n",
       "34183  United-States   >50K  \n",
       "34184  United-States  <=50K  \n",
       "34185    El-Salvador  <=50K  \n",
       "34186  United-States  <=50K  \n",
       "34187  United-States  <=50K  \n",
       "34188  United-States  <=50K  \n",
       "\n",
       "[34189 rows x 15 columns]"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "alive-honey",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.704744</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>-1.275191</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.147766</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.655990</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.818617</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.335985</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.197621</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.407546</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.149067</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.020718</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.342117</td>\n",
       "      <td>9th</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>-0.376300</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.987078</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>?</td>\n",
       "      <td>0.042399</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>?</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.382011</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.332483</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Priv-house-serv</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.549787</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.978500</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.153595</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.910723</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>Private</td>\n",
       "      <td>-0.948722</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>2.377888</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.531605</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>1.515021</td>\n",
       "      <td>Masters</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.527612</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.244809</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>?</td>\n",
       "      <td>1.250871</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>Widowed</td>\n",
       "      <td>?</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>1.759484</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>Private</td>\n",
       "      <td>-1.003732</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>-0.071019</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age         workclass    fnlwgt     education  education-num  \\\n",
       "34189  0.181564  Self-emp-not-inc  0.704744  Some-college      -0.030259   \n",
       "34190  0.181564         State-gov -1.275191     Bachelors       1.132573   \n",
       "34191 -1.367214           Private -0.147766     Assoc-voc       0.357352   \n",
       "34192  0.955953         State-gov  0.655990  Some-college      -0.030259   \n",
       "34193  0.955953           Private  0.818617       HS-grad      -0.417870   \n",
       "34194 -1.367214           Private -0.335985  Some-college      -0.030259   \n",
       "34195 -0.592825         Local-gov  0.197621  Some-college      -0.030259   \n",
       "34196 -0.592825           Private  1.407546  Some-college      -0.030259   \n",
       "34197 -0.592825         State-gov  0.149067     Bachelors       1.132573   \n",
       "34198 -1.367214           Private -0.020718  Some-college      -0.030259   \n",
       "34199 -0.592825           Private -0.342117           9th      -1.968313   \n",
       "34200 -0.592825         Local-gov -0.376300  Some-college      -0.030259   \n",
       "34201  0.181564           Private  1.987078  Some-college      -0.030259   \n",
       "34202 -1.367214                 ?  0.042399  Some-college      -0.030259   \n",
       "34203  0.181564           Private -1.382011    Assoc-acdm       0.744962   \n",
       "...         ...               ...       ...           ...            ...   \n",
       "48827  0.955953           Private  0.332483       HS-grad      -0.417870   \n",
       "48828  0.181564           Private  0.549787     Assoc-voc       0.357352   \n",
       "48829  1.730342           Private  0.978500    Assoc-acdm       0.744962   \n",
       "48830 -0.592825           Private -0.153595       HS-grad      -0.417870   \n",
       "48831  0.955953           Private  0.910723       HS-grad      -0.417870   \n",
       "48832  1.730342           Private -0.948722       HS-grad      -0.417870   \n",
       "48833 -0.592825           Private  2.377888       HS-grad      -0.417870   \n",
       "48834 -1.367214           Private  1.531605       HS-grad      -0.417870   \n",
       "48835  0.955953         Local-gov  1.515021       Masters       1.520184   \n",
       "48836 -0.592825           Private  0.527612     Bachelors       1.132573   \n",
       "48837  0.181564           Private  0.244809     Bachelors       1.132573   \n",
       "48838  1.730342                 ?  1.250871       HS-grad      -0.417870   \n",
       "48839  0.181564           Private  1.759484     Bachelors       1.132573   \n",
       "48840  0.181564           Private -1.003732     Bachelors       1.132573   \n",
       "48841 -0.592825      Self-emp-inc -0.071019     Bachelors       1.132573   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "34189     Married-civ-spouse       Craft-repair         Husband   \n",
       "34190     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34191          Never-married      Other-service       Own-child   \n",
       "34192     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34193     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34194          Never-married              Sales       Own-child   \n",
       "34195     Married-civ-spouse       Craft-repair  Other-relative   \n",
       "34196               Divorced       Adm-clerical       Unmarried   \n",
       "34197          Never-married     Prof-specialty   Not-in-family   \n",
       "34198              Separated      Other-service       Own-child   \n",
       "34199              Separated       Craft-repair   Not-in-family   \n",
       "34200               Divorced       Adm-clerical       Unmarried   \n",
       "34201     Married-civ-spouse       Craft-repair         Husband   \n",
       "34202          Never-married                  ?       Own-child   \n",
       "34203  Married-spouse-absent       Adm-clerical  Other-relative   \n",
       "...                      ...                ...             ...   \n",
       "48827              Separated    Priv-house-serv   Not-in-family   \n",
       "48828          Never-married       Adm-clerical       Unmarried   \n",
       "48829               Divorced     Prof-specialty   Not-in-family   \n",
       "48830     Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "48831     Married-civ-spouse       Adm-clerical         Husband   \n",
       "48832     Married-civ-spouse              Sales         Husband   \n",
       "48833     Married-civ-spouse       Craft-repair         Husband   \n",
       "48834          Never-married      Other-service       Own-child   \n",
       "48835               Divorced      Other-service   Not-in-family   \n",
       "48836          Never-married     Prof-specialty       Own-child   \n",
       "48837               Divorced     Prof-specialty   Not-in-family   \n",
       "48838                Widowed                  ?  Other-relative   \n",
       "48839     Married-civ-spouse     Prof-specialty         Husband   \n",
       "48840               Divorced       Adm-clerical       Own-child   \n",
       "48841     Married-civ-spouse    Exec-managerial         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "34189               White    Male    -0.271118    -0.206016      1.172925   \n",
       "34190               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34191               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34192               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34193               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34194               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34195               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34196               Black  Female    -0.271118    -0.206016     -1.065986   \n",
       "34197               White  Female    -0.271118    -0.206016     -2.185441   \n",
       "34198               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34199               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34200               White  Female    -0.271118    -0.206016      0.053470   \n",
       "34201               White    Male    -0.271118    -0.206016      0.053470   \n",
       "34202               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "34203               White    Male    -0.271118    -0.206016      1.172925   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "48827               White  Female    -0.271118    -0.206016     -1.065986   \n",
       "48828               Black  Female    -0.271118    -0.206016      0.053470   \n",
       "48829               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48830               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48831               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48832               White    Male    -0.271118    -0.206016      1.172925   \n",
       "48833               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48834               White  Female    -0.271118    -0.206016      0.053470   \n",
       "48835               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48836               White    Male    -0.271118    -0.206016      0.053470   \n",
       "48837               White  Female    -0.271118    -0.206016      0.053470   \n",
       "48838               Black    Male    -0.271118    -0.206016      0.053470   \n",
       "48839               White    Male    -0.271118    -0.206016      1.172925   \n",
       "48840  Asian-Pac-Islander    Male     2.380648    -0.206016      0.053470   \n",
       "48841               White    Male    -0.271118    -0.206016      1.172925   \n",
       "\n",
       "      native-country  class  \n",
       "34189  United-States  <=50K  \n",
       "34190  United-States   >50K  \n",
       "34191  United-States  <=50K  \n",
       "34192  United-States   >50K  \n",
       "34193  United-States  <=50K  \n",
       "34194  United-States  <=50K  \n",
       "34195  United-States  <=50K  \n",
       "34196  United-States  <=50K  \n",
       "34197  United-States  <=50K  \n",
       "34198  United-States  <=50K  \n",
       "34199  United-States  <=50K  \n",
       "34200  United-States  <=50K  \n",
       "34201  United-States  <=50K  \n",
       "34202  United-States  <=50K  \n",
       "34203  United-States  <=50K  \n",
       "...              ...    ...  \n",
       "48827  United-States  <=50K  \n",
       "48828  United-States  <=50K  \n",
       "48829  United-States  <=50K  \n",
       "48830  United-States  <=50K  \n",
       "48831  United-States  <=50K  \n",
       "48832  United-States  <=50K  \n",
       "48833  United-States  <=50K  \n",
       "48834  United-States  <=50K  \n",
       "48835  United-States  <=50K  \n",
       "48836  United-States  <=50K  \n",
       "48837  United-States  <=50K  \n",
       "48838  United-States  <=50K  \n",
       "48839  United-States  <=50K  \n",
       "48840  United-States  <=50K  \n",
       "48841  United-States   >50K  \n",
       "\n",
       "[14653 rows x 15 columns]"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_test_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "hazardous-designation",
   "metadata": {},
   "source": [
    "#### cat_encoding=\"one_hot\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "headed-hydrogen",
   "metadata": {},
   "outputs": [],
   "source": [
    "cleaned_training_df, cleaned_test_df = clean_ml(training_df, test_df, target=\"class\", cat_encoding=\"one_hot\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "swedish-fabric",
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.064247</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>1.054765</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.009237</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.246964</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.428035</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.412302</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.901345</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.279485</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.189970</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.365494</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>5.032415</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.286491</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.862254</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>2.292380</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.458800</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.639397</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.146086</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.644143</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.151677</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.382480</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.407759</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.153272</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.323123</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.070743</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.761452</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>-0.367482</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>5.199568</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.428648</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.817212</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.753877</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.311551</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>7.001429</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.720198</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.608356</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.113276</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age                                      workclass    fnlwgt  \\\n",
       "0      0.181564  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.064247   \n",
       "1      0.955953  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.009237   \n",
       "2      0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.246964   \n",
       "3      0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.428035   \n",
       "4     -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.412302   \n",
       "5      0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.901345   \n",
       "6      0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.279485   \n",
       "7      0.955953  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.189970   \n",
       "8     -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.365494   \n",
       "9      0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.286491   \n",
       "10     0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.862254   \n",
       "11    -0.592825  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.458800   \n",
       "12    -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.639397   \n",
       "13    -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.146086   \n",
       "14     0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.644143   \n",
       "...         ...                                            ...       ...   \n",
       "34174  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.151677   \n",
       "34175  0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.382480   \n",
       "34176  1.730342  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.407759   \n",
       "34177  1.730342  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.153272   \n",
       "34178 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.323123   \n",
       "34179  0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.070743   \n",
       "34180 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.761452   \n",
       "34181  0.955953  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0] -0.367482   \n",
       "34182  1.730342  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.428648   \n",
       "34183  0.955953  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0] -0.817212   \n",
       "34184  1.730342  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.753877   \n",
       "34185  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.311551   \n",
       "34186 -1.367214  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0] -0.720198   \n",
       "34187  0.181564  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  0.608356   \n",
       "34188 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.113276   \n",
       "\n",
       "                                               education  education-num  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "1      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "2      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "3      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -1.193092   \n",
       "4      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "5      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.520184   \n",
       "6      [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...      -1.968313   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "8      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.520184   \n",
       "9      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "10     [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "11     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "13     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...       0.744962   \n",
       "14     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...       0.357352   \n",
       "...                                                  ...            ...   \n",
       "34174  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34175  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34176  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34177  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34178  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -1.193092   \n",
       "34179  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34180  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34181  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34182  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34183  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34184  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34185  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34186  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34187  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -1.193092   \n",
       "34188  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "\n",
       "                            marital-status  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "1      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "2      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "3      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "4      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "5      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "6      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "8      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "9      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "10     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "11     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "13     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "14     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "...                                    ...   \n",
       "34174  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34175  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "34176  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34177  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34178  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34179  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34180  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34181  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34182  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34183  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34184  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34185  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34186  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34187  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34188  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "\n",
       "                                              occupation  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "1      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "2      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "3      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "4      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "5      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "6      [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "8      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "9      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "10     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "11     [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "13     [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "14     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "...                                                  ...   \n",
       "34174  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34175  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34176  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34177  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34178  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34179  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34180  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "34181  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34182  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34183  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34184  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34185  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "34186  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34187  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34188  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "\n",
       "                         relationship                       race         sex  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "1      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "2      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "3      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "4      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "5      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "6      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "8      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "9      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "10     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "11     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "12     [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "13     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "14     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "...                               ...                        ...         ...   \n",
       "34174  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34175  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34176  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34177  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34178  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34179  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34180  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34181  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34182  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34183  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34184  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34185  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34186  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34187  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34188  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "\n",
       "       capitalgain  capitalloss  hoursperweek  \\\n",
       "0         1.054765    -0.206016      0.053470   \n",
       "1        -0.271118    -0.206016     -2.185441   \n",
       "2        -0.271118    -0.206016      0.053470   \n",
       "3        -0.271118    -0.206016      0.053470   \n",
       "4        -0.271118    -0.206016      0.053470   \n",
       "5        -0.271118    -0.206016      0.053470   \n",
       "6        -0.271118    -0.206016     -2.185441   \n",
       "7        -0.271118    -0.206016      0.053470   \n",
       "8         5.032415    -0.206016      1.172925   \n",
       "9         2.380648    -0.206016      0.053470   \n",
       "10       -0.271118    -0.206016      2.292380   \n",
       "11       -0.271118    -0.206016      0.053470   \n",
       "12       -0.271118    -0.206016     -1.065986   \n",
       "13       -0.271118    -0.206016      1.172925   \n",
       "14       -0.271118    -0.206016      0.053470   \n",
       "...            ...          ...           ...   \n",
       "34174    -0.271118    -0.206016      0.053470   \n",
       "34175    -0.271118    -0.206016      0.053470   \n",
       "34176    -0.271118    -0.206016      0.053470   \n",
       "34177    -0.271118    -0.206016     -1.065986   \n",
       "34178    -0.271118    -0.206016     -2.185441   \n",
       "34179    -0.271118    -0.206016     -1.065986   \n",
       "34180    -0.271118    -0.206016     -2.185441   \n",
       "34181    -0.271118     5.199568      0.053470   \n",
       "34182    -0.271118    -0.206016     -1.065986   \n",
       "34183    -0.271118    -0.206016      1.172925   \n",
       "34184    -0.271118    -0.206016      0.053470   \n",
       "34185    -0.271118     7.001429      0.053470   \n",
       "34186    -0.271118    -0.206016      0.053470   \n",
       "34187    -0.271118    -0.206016     -2.185441   \n",
       "34188    -0.271118    -0.206016      0.053470   \n",
       "\n",
       "                                          native-country  class  \n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "1      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "2      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "3      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "4      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "5      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "6      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "7      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "8      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "9      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "10     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "11     [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "13     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "14     [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "...                                                  ...    ...  \n",
       "34174  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34175  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34176  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34177  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34178  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34179  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34180  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34181  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34182  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34183  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34184  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34185  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34186  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34187  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34188  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "\n",
       "[34189 rows x 15 columns]"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "advisory-shade",
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.704744</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.275191</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.147766</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.655990</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.818617</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.335985</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.197621</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.407546</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.149067</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-2.185441</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.020718</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.342117</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.376300</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.987078</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.042399</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.382011</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.332483</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>-1.065986</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.549787</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.978500</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.153595</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.910723</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.948722</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>2.377888</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>-1.367214</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.531605</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>0.955953</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.515021</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.527612</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.244809</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>1.730342</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.250871</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.759484</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>0.181564</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.003732</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>2.380648</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>0.053470</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>-0.592825</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>-0.071019</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>-0.271118</td>\n",
       "      <td>-0.206016</td>\n",
       "      <td>1.172925</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "            age                                      workclass    fnlwgt  \\\n",
       "34189  0.181564  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.704744   \n",
       "34190  0.181564  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.275191   \n",
       "34191 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.147766   \n",
       "34192  0.955953  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.655990   \n",
       "34193  0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.818617   \n",
       "34194 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.335985   \n",
       "34195 -0.592825  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  0.197621   \n",
       "34196 -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.407546   \n",
       "34197 -0.592825  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.149067   \n",
       "34198 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.020718   \n",
       "34199 -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.342117   \n",
       "34200 -0.592825  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0] -0.376300   \n",
       "34201  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.987078   \n",
       "34202 -1.367214  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  0.042399   \n",
       "34203  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.382011   \n",
       "...         ...                                            ...       ...   \n",
       "48827  0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.332483   \n",
       "48828  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.549787   \n",
       "48829  1.730342  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.978500   \n",
       "48830 -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.153595   \n",
       "48831  0.955953  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.910723   \n",
       "48832  1.730342  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.948722   \n",
       "48833 -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  2.377888   \n",
       "48834 -1.367214  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.531605   \n",
       "48835  0.955953  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  1.515021   \n",
       "48836 -0.592825  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.527612   \n",
       "48837  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.244809   \n",
       "48838  1.730342  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  1.250871   \n",
       "48839  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.759484   \n",
       "48840  0.181564  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.003732   \n",
       "48841 -0.592825  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0] -0.071019   \n",
       "\n",
       "                                               education  education-num  \\\n",
       "34189  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34190  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34191  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...       0.357352   \n",
       "34192  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34193  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34194  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34195  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34196  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34198  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34199  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...      -1.968313   \n",
       "34200  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34201  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34202  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34203  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...       0.744962   \n",
       "...                                                  ...            ...   \n",
       "48827  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48828  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...       0.357352   \n",
       "48829  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...       0.744962   \n",
       "48830  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48831  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48832  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48833  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48834  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48835  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.520184   \n",
       "48836  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48837  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48838  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48839  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48840  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48841  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "\n",
       "                            marital-status  \\\n",
       "34189  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34190  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34191  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34192  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34193  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34194  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34195  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34196  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34198  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "34199  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "34200  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34201  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34202  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34203  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   \n",
       "...                                    ...   \n",
       "48827  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "48828  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48829  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48830  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48831  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48832  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48833  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48834  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48835  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48836  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48837  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48838  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]   \n",
       "48839  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48840  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48841  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "\n",
       "                                              occupation  \\\n",
       "34189  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34190  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34191  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34192  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34193  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34194  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "34195  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34196  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34197  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34198  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34199  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34200  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34201  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34202  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34203  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "...                                                  ...   \n",
       "48827  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48828  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48829  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48830  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48831  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48832  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "48833  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "48834  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48835  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48836  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48837  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48838  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48839  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48840  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48841  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "\n",
       "                         relationship                       race         sex  \\\n",
       "34189  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34190  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34191  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34192  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34193  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34194  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34195  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34196  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34198  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34199  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34200  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34201  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34202  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34203  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "...                               ...                        ...         ...   \n",
       "48827  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48828  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48829  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48830  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48831  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48832  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48833  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48834  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48835  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48836  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48837  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48838  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48839  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48840  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48841  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "\n",
       "       capitalgain  capitalloss  hoursperweek  \\\n",
       "34189    -0.271118    -0.206016      1.172925   \n",
       "34190    -0.271118    -0.206016      0.053470   \n",
       "34191    -0.271118    -0.206016     -2.185441   \n",
       "34192    -0.271118    -0.206016      0.053470   \n",
       "34193    -0.271118    -0.206016      0.053470   \n",
       "34194    -0.271118    -0.206016     -1.065986   \n",
       "34195    -0.271118    -0.206016      0.053470   \n",
       "34196    -0.271118    -0.206016     -1.065986   \n",
       "34197    -0.271118    -0.206016     -2.185441   \n",
       "34198    -0.271118    -0.206016      0.053470   \n",
       "34199    -0.271118    -0.206016      0.053470   \n",
       "34200    -0.271118    -0.206016      0.053470   \n",
       "34201    -0.271118    -0.206016      0.053470   \n",
       "34202    -0.271118    -0.206016     -1.065986   \n",
       "34203    -0.271118    -0.206016      1.172925   \n",
       "...            ...          ...           ...   \n",
       "48827    -0.271118    -0.206016     -1.065986   \n",
       "48828    -0.271118    -0.206016      0.053470   \n",
       "48829    -0.271118    -0.206016      0.053470   \n",
       "48830    -0.271118    -0.206016      0.053470   \n",
       "48831    -0.271118    -0.206016      0.053470   \n",
       "48832    -0.271118    -0.206016      1.172925   \n",
       "48833    -0.271118    -0.206016      0.053470   \n",
       "48834    -0.271118    -0.206016      0.053470   \n",
       "48835    -0.271118    -0.206016      0.053470   \n",
       "48836    -0.271118    -0.206016      0.053470   \n",
       "48837    -0.271118    -0.206016      0.053470   \n",
       "48838    -0.271118    -0.206016      0.053470   \n",
       "48839    -0.271118    -0.206016      1.172925   \n",
       "48840     2.380648    -0.206016      0.053470   \n",
       "48841    -0.271118    -0.206016      1.172925   \n",
       "\n",
       "                                          native-country  class  \n",
       "34189  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34190  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34191  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34192  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34193  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34194  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34195  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34196  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34198  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34199  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34200  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34201  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34202  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34203  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "...                                                  ...    ...  \n",
       "48827  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48828  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48829  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48830  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48831  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48832  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48833  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48834  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48835  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48836  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48837  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48838  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48839  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48840  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48841  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "\n",
       "[14653 rows x 15 columns]"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_test_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "numeric-amazon",
   "metadata": {},
   "source": [
    "## 6. `variance_threshold` and `variance` parameters\n",
    "The `variance_threshold` parameter takes one of two values:\n",
    "* `True`: drop numerical columns whose variance is less than the `variance` value.\n",
    "* `False`: do nothing.\n",
    "\n",
    "The default `variance_threshold` is `False`.\n",
    "\n",
    "The default `variance` is `0.0`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "tender-yugoslavia",
   "metadata": {},
   "outputs": [],
   "source": [
    "cleaned_training_df, cleaned_test_df = clean_ml(training_df, test_df, target=\"class\", \n",
    "                                                variance_threshold=True, variance=6.0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "falling-attempt",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.064247</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.009237</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.246964</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.428035</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.412302</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.901345</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.279485</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.189970</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.365494</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.286491</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.862254</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.458800</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.639397</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.146086</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.644143</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.151677</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.382480</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.407759</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.153272</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.323123</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.070743</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.761452</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>-0.367482</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.428648</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.817212</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.753877</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.311551</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.720198</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.608356</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.193092</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.113276</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 11 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                           workclass    fnlwgt  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.064247   \n",
       "1      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.009237   \n",
       "2      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.246964   \n",
       "3      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.428035   \n",
       "4      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.412302   \n",
       "5      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.901345   \n",
       "6      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.279485   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.189970   \n",
       "8      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.365494   \n",
       "9      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.286491   \n",
       "10     [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.862254   \n",
       "11     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.458800   \n",
       "12     [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.639397   \n",
       "13     [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.146086   \n",
       "14     [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.644143   \n",
       "...                                              ...       ...   \n",
       "34174  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.151677   \n",
       "34175  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.382480   \n",
       "34176  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.407759   \n",
       "34177  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.153272   \n",
       "34178  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.323123   \n",
       "34179  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.070743   \n",
       "34180  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.761452   \n",
       "34181  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0] -0.367482   \n",
       "34182  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.428648   \n",
       "34183  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0] -0.817212   \n",
       "34184  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.753877   \n",
       "34185  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.311551   \n",
       "34186  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0] -0.720198   \n",
       "34187  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  0.608356   \n",
       "34188  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.113276   \n",
       "\n",
       "                                               education  education-num  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "1      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "2      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "3      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -1.193092   \n",
       "4      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "5      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.520184   \n",
       "6      [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...      -1.968313   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "8      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.520184   \n",
       "9      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "10     [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "11     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "13     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...       0.744962   \n",
       "14     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...       0.357352   \n",
       "...                                                  ...            ...   \n",
       "34174  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34175  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34176  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34177  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34178  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -1.193092   \n",
       "34179  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34180  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34181  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34182  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34183  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34184  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34185  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34186  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34187  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -1.193092   \n",
       "34188  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "\n",
       "                            marital-status  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "1      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "2      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "3      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "4      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "5      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "6      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "8      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "9      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "10     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "11     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "13     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "14     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "...                                    ...   \n",
       "34174  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34175  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "34176  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34177  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34178  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34179  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34180  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34181  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34182  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34183  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34184  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34185  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34186  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34187  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34188  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "\n",
       "                                              occupation  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "1      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "2      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "3      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "4      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "5      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "6      [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "8      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "9      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "10     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "11     [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "13     [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "14     [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "...                                                  ...   \n",
       "34174  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34175  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34176  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34177  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34178  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34179  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34180  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "34181  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34182  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34183  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34184  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34185  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "34186  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34187  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34188  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "\n",
       "                         relationship                       race         sex  \\\n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "1      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "2      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "3      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "4      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "5      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "6      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "7      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "8      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "9      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "10     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "11     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "12     [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "13     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "14     [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "...                               ...                        ...         ...   \n",
       "34174  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34175  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34176  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34177  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34178  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34179  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34180  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34181  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34182  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34183  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34184  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34185  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34186  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34187  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34188  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "\n",
       "                                          native-country  class  \n",
       "0      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "1      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "2      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "3      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "4      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "5      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "6      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "7      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "8      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "9      [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "10     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "11     [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "12     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "13     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "14     [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "...                                                  ...    ...  \n",
       "34174  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34175  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34176  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34177  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34178  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34179  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34180  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34181  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34182  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34183  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34184  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34185  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34186  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34187  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34188  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "\n",
       "[34189 rows x 11 columns]"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "id": "pediatric-allergy",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.704744</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.275191</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.147766</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.655990</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.818617</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.335985</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.197621</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.407546</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.149067</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.020718</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.342117</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-1.968313</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.376300</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.987078</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.042399</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.030259</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.382011</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.332483</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.549787</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...</td>\n",
       "      <td>0.357352</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.978500</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>0.744962</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.153595</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.910723</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-0.948722</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>2.377888</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.531605</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.515021</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.520184</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.527612</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>0.244809</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.250871</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>-0.417870</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 1.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>1.759484</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>-1.003732</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]</td>\n",
       "      <td>-0.071019</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>1.132573</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>[0.0, 1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0]</td>\n",
       "      <td>[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 11 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                           workclass    fnlwgt  \\\n",
       "34189  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.704744   \n",
       "34190  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.275191   \n",
       "34191  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.147766   \n",
       "34192  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.655990   \n",
       "34193  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.818617   \n",
       "34194  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.335985   \n",
       "34195  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  0.197621   \n",
       "34196  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.407546   \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.149067   \n",
       "34198  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.020718   \n",
       "34199  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.342117   \n",
       "34200  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0] -0.376300   \n",
       "34201  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.987078   \n",
       "34202  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  0.042399   \n",
       "34203  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.382011   \n",
       "...                                              ...       ...   \n",
       "48827  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.332483   \n",
       "48828  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.549787   \n",
       "48829  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.978500   \n",
       "48830  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.153595   \n",
       "48831  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.910723   \n",
       "48832  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -0.948722   \n",
       "48833  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  2.377888   \n",
       "48834  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.531605   \n",
       "48835  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  1.515021   \n",
       "48836  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.527612   \n",
       "48837  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  0.244809   \n",
       "48838  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  1.250871   \n",
       "48839  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  1.759484   \n",
       "48840  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] -1.003732   \n",
       "48841  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0] -0.071019   \n",
       "\n",
       "                                               education  education-num  \\\n",
       "34189  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34190  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34191  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...       0.357352   \n",
       "34192  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34193  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "34194  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34195  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34196  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "34198  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34199  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...      -1.968313   \n",
       "34200  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34201  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34202  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...      -0.030259   \n",
       "34203  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...       0.744962   \n",
       "...                                                  ...            ...   \n",
       "48827  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48828  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ...       0.357352   \n",
       "48829  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...       0.744962   \n",
       "48830  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48831  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48832  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48833  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48834  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48835  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.520184   \n",
       "48836  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48837  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48838  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...      -0.417870   \n",
       "48839  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48840  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "48841  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...       1.132573   \n",
       "\n",
       "                            marital-status  \\\n",
       "34189  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34190  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34191  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34192  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34193  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34194  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34195  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34196  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34198  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "34199  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "34200  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34201  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34202  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "34203  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   \n",
       "...                                    ...   \n",
       "48827  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]   \n",
       "48828  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48829  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48830  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48831  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48832  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48833  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48834  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48835  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48836  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48837  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48838  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]   \n",
       "48839  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48840  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "48841  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   \n",
       "\n",
       "                                              occupation  \\\n",
       "34189  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34190  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34191  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34192  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34193  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34194  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "34195  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34196  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34197  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34198  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34199  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34200  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34201  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "34202  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "34203  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "...                                                  ...   \n",
       "48827  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48828  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48829  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48830  [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48831  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48832  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, ...   \n",
       "48833  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, ...   \n",
       "48834  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48835  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48836  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48837  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48838  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48839  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48840  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "48841  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   \n",
       "\n",
       "                         relationship                       race         sex  \\\n",
       "34189  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34190  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34191  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34192  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34193  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34194  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34195  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34196  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34198  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34199  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34200  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34201  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "34202  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "34203  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "...                               ...                        ...         ...   \n",
       "48827  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48828  [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48829  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48830  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48831  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48832  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48833  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48834  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48835  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48836  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48837  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [0.0, 1.0]   \n",
       "48838  [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  [0.0, 1.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48839  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48840  [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  [0.0, 0.0, 1.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "48841  [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0, 0.0, 0.0, 0.0]  [1.0, 0.0]   \n",
       "\n",
       "                                          native-country  class  \n",
       "34189  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34190  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34191  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34192  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "34193  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34194  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34195  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34196  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34197  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34198  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34199  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34200  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34201  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34202  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "34203  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "...                                                  ...    ...  \n",
       "48827  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48828  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48829  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48830  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48831  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48832  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48833  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48834  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48835  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48836  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48837  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48838  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48839  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48840  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...  <=50K  \n",
       "48841  [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...   >50K  \n",
       "\n",
       "[14653 rows x 11 columns]"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_test_df "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "every-malpractice",
   "metadata": {},
   "source": [
    "## 7. `num_scaling` parameter\n",
    "The `num_scaling` parameter accepts three values:\n",
    "* `standardize`: standardizes each numerical column using its mean and standard deviation. The transformation is (x - mean) / std.\n",
    "* `minmax`: scales each numerical column using its minimum and maximum values. The transformation is (x - min) / (max - min).\n",
    "* `maxabs`: scales each numerical column by its maximum absolute value. The transformation is x / maxabs.\n",
    "\n",
    "The default `num_scaling` is `standardize`."
   ]
  },
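  {
   "cell_type": "markdown",
   "id": "scaling-worked-example",
   "metadata": {},
   "source": [
    "As a small worked illustration of the three formulas above, take a numerical column with the values 2, 4, 4, 4, 5, 5, 7, 9. These values are made up for this example and do not come from the adult dataset. The column has mean 5, minimum 2, maximum 9, and maximum absolute value 9. Assuming the population standard deviation is used (an assumption; `clean_ml()` may compute std differently), std = 2. For the entry x = 7:\n",
    "\n",
    "$$\\text{standardize: } \\frac{7 - 5}{2} = 1, \\qquad \\text{minmax: } \\frac{7 - 2}{9 - 2} = \\frac{5}{7} \\approx 0.714, \\qquad \\text{maxabs: } \\frac{7}{9} \\approx 0.778$$\n",
    "\n",
    "In general, `minmax` maps the column it is fitted on into [0, 1], while `maxabs` maps it into [-1, 1] and leaves zeros at zero."
   ]
  },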
  {
   "cell_type": "code",
   "execution_count": 55,
   "id": "adjacent-shepherd",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Scale numerical columns with min-max scaling; keep categorical columns unencoded.\n",
    "cleaned_training_df, cleaned_test_df = clean_ml(training_df, test_df, target=\"class\",\n",
    "                                                cat_encoding=\"no_encoding\",\n",
    "                                                num_scaling=\"minmax\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "labeled-survivor",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.50</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.044302</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.25</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.048238</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.138113</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.151068</td>\n",
       "      <td>11th</td>\n",
       "      <td>0.400000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.221488</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Wife</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>Cuba</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.184932</td>\n",
       "      <td>Masters</td>\n",
       "      <td>0.866667</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.100448</td>\n",
       "      <td>9th</td>\n",
       "      <td>0.266667</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>Jamaica</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.134036</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.022749</td>\n",
       "      <td>Masters</td>\n",
       "      <td>0.866667</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>1.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.099947</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.50</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.182135</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>1.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>0.25</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.087619</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>India</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.074698</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.130896</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.733333</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.074359</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.666667</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>?</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.109592</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.093079</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.091271</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>?</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.109478</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.143562</td>\n",
       "      <td>11th</td>\n",
       "      <td>0.400000</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.115383</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Protective-serv</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.065966</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>India</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>0.094152</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.75</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.018231</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.061976</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.066508</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.142734</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>1.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>El-Salvador</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>0.00</td>\n",
       "      <td>?</td>\n",
       "      <td>0.068917</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>?</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>0.50</td>\n",
       "      <td>?</td>\n",
       "      <td>0.163970</td>\n",
       "      <td>11th</td>\n",
       "      <td>0.400000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>?</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.200094</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Machine-op-inspct</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        age         workclass    fnlwgt     education  education-num  \\\n",
       "0      0.50         State-gov  0.044302     Bachelors       0.800000   \n",
       "1      0.75  Self-emp-not-inc  0.048238     Bachelors       0.800000   \n",
       "2      0.50           Private  0.138113       HS-grad       0.533333   \n",
       "3      0.75           Private  0.151068          11th       0.400000   \n",
       "4      0.25           Private  0.221488     Bachelors       0.800000   \n",
       "5      0.50           Private  0.184932       Masters       0.866667   \n",
       "6      0.75           Private  0.100448           9th       0.266667   \n",
       "7      0.75  Self-emp-not-inc  0.134036       HS-grad       0.533333   \n",
       "8      0.25           Private  0.022749       Masters       0.866667   \n",
       "9      0.50           Private  0.099947     Bachelors       0.800000   \n",
       "10     0.50           Private  0.182135  Some-college       0.600000   \n",
       "11     0.25         State-gov  0.087619     Bachelors       0.800000   \n",
       "12     0.00           Private  0.074698     Bachelors       0.800000   \n",
       "13     0.25           Private  0.130896    Assoc-acdm       0.733333   \n",
       "14     0.50           Private  0.074359     Assoc-voc       0.666667   \n",
       "...     ...               ...       ...           ...            ...   \n",
       "34174  0.50           Private  0.109592  Some-college       0.600000   \n",
       "34175  0.75           Private  0.093079       HS-grad       0.533333   \n",
       "34176  1.00           Private  0.091271       HS-grad       0.533333   \n",
       "34177  1.00           Private  0.109478     Bachelors       0.800000   \n",
       "34178  0.00           Private  0.143562          11th       0.400000   \n",
       "34179  0.75           Private  0.115383  Some-college       0.600000   \n",
       "34180  0.00           Private  0.065966  Some-college       0.600000   \n",
       "34181  0.75      Self-emp-inc  0.094152  Some-college       0.600000   \n",
       "34182  1.00  Self-emp-not-inc  0.018231       HS-grad       0.533333   \n",
       "34183  0.75         Local-gov  0.061976     Bachelors       0.800000   \n",
       "34184  1.00           Private  0.066508       HS-grad       0.533333   \n",
       "34185  0.50           Private  0.142734       HS-grad       0.533333   \n",
       "34186  0.00                 ?  0.068917       HS-grad       0.533333   \n",
       "34187  0.50                 ?  0.163970          11th       0.400000   \n",
       "34188  0.00           Private  0.200094       HS-grad       0.533333   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "0              Never-married       Adm-clerical   Not-in-family   \n",
       "1         Married-civ-spouse    Exec-managerial         Husband   \n",
       "2                   Divorced  Handlers-cleaners   Not-in-family   \n",
       "3         Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "4         Married-civ-spouse     Prof-specialty            Wife   \n",
       "5         Married-civ-spouse    Exec-managerial            Wife   \n",
       "6      Married-spouse-absent      Other-service   Not-in-family   \n",
       "7         Married-civ-spouse    Exec-managerial         Husband   \n",
       "8              Never-married     Prof-specialty   Not-in-family   \n",
       "9         Married-civ-spouse    Exec-managerial         Husband   \n",
       "10        Married-civ-spouse    Exec-managerial         Husband   \n",
       "11        Married-civ-spouse     Prof-specialty         Husband   \n",
       "12             Never-married       Adm-clerical       Own-child   \n",
       "13             Never-married              Sales   Not-in-family   \n",
       "14        Married-civ-spouse       Craft-repair         Husband   \n",
       "...                      ...                ...             ...   \n",
       "34174     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34175              Separated  Handlers-cleaners   Not-in-family   \n",
       "34176     Married-civ-spouse      Other-service         Husband   \n",
       "34177               Divorced     Prof-specialty   Not-in-family   \n",
       "34178          Never-married      Other-service       Own-child   \n",
       "34179               Divorced    Protective-serv       Unmarried   \n",
       "34180          Never-married              Sales  Other-relative   \n",
       "34181     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34182     Married-civ-spouse       Craft-repair         Husband   \n",
       "34183     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34184     Married-civ-spouse      Other-service         Husband   \n",
       "34185          Never-married              Sales   Not-in-family   \n",
       "34186          Never-married                  ?       Own-child   \n",
       "34187     Married-civ-spouse                  ?            Wife   \n",
       "34188     Married-civ-spouse  Machine-op-inspct         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "0                   White    Male         0.25         0.00          0.50   \n",
       "1                   White    Male         0.00         0.00          0.00   \n",
       "2                   White    Male         0.00         0.00          0.50   \n",
       "3                   Black    Male         0.00         0.00          0.50   \n",
       "4                   Black  Female         0.00         0.00          0.50   \n",
       "5                   White  Female         0.00         0.00          0.50   \n",
       "6                   Black  Female         0.00         0.00          0.00   \n",
       "7                   White    Male         0.00         0.00          0.50   \n",
       "8                   White  Female         1.00         0.00          0.75   \n",
       "9                   White    Male         0.50         0.00          0.50   \n",
       "10                  Black    Male         0.00         0.00          1.00   \n",
       "11     Asian-Pac-Islander    Male         0.00         0.00          0.50   \n",
       "12                  White  Female         0.00         0.00          0.25   \n",
       "13                  Black    Male         0.00         0.00          0.75   \n",
       "14     Asian-Pac-Islander    Male         0.00         0.00          0.50   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "34174               White    Male         0.00         0.00          0.50   \n",
       "34175               White    Male         0.00         0.00          0.50   \n",
       "34176               Black    Male         0.00         0.00          0.50   \n",
       "34177               White  Female         0.00         0.00          0.25   \n",
       "34178               White    Male         0.00         0.00          0.00   \n",
       "34179               White  Female         0.00         0.00          0.25   \n",
       "34180  Asian-Pac-Islander    Male         0.00         0.00          0.00   \n",
       "34181               White    Male         0.00         0.75          0.50   \n",
       "34182               White    Male         0.00         0.00          0.25   \n",
       "34183               White    Male         0.00         0.00          0.75   \n",
       "34184               Black    Male         0.00         0.00          0.50   \n",
       "34185               White    Male         0.00         1.00          0.50   \n",
       "34186               White  Female         0.00         0.00          0.50   \n",
       "34187               White  Female         0.00         0.00          0.00   \n",
       "34188               White    Male         0.00         0.00          0.50   \n",
       "\n",
       "      native-country  class  \n",
       "0      United-States  <=50K  \n",
       "1      United-States  <=50K  \n",
       "2      United-States  <=50K  \n",
       "3      United-States  <=50K  \n",
       "4               Cuba  <=50K  \n",
       "5      United-States  <=50K  \n",
       "6            Jamaica  <=50K  \n",
       "7      United-States   >50K  \n",
       "8      United-States   >50K  \n",
       "9      United-States   >50K  \n",
       "10     United-States   >50K  \n",
       "11             India   >50K  \n",
       "12     United-States  <=50K  \n",
       "13     United-States  <=50K  \n",
       "14                 ?   >50K  \n",
       "...              ...    ...  \n",
       "34174  United-States   >50K  \n",
       "34175  United-States  <=50K  \n",
       "34176              ?   >50K  \n",
       "34177  United-States  <=50K  \n",
       "34178  United-States  <=50K  \n",
       "34179  United-States  <=50K  \n",
       "34180          India  <=50K  \n",
       "34181  United-States   >50K  \n",
       "34182  United-States  <=50K  \n",
       "34183  United-States   >50K  \n",
       "34184  United-States  <=50K  \n",
       "34185    El-Salvador  <=50K  \n",
       "34186  United-States  <=50K  \n",
       "34187  United-States  <=50K  \n",
       "34188  United-States  <=50K  \n",
       "\n",
       "[34189 rows x 15 columns]"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "id": "closing-speaking",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.170866</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>0.50</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.029210</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.109872</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.666667</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>0.75</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.167378</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.179013</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.096406</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.134583</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.221148</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>0.25</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.131109</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.118962</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.095967</td>\n",
       "      <td>9th</td>\n",
       "      <td>0.266667</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.093522</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.262611</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>0.00</td>\n",
       "      <td>?</td>\n",
       "      <td>0.123478</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.600000</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>?</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.021567</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.733333</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.144232</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Priv-house-serv</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.159779</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.666667</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.190452</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.733333</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.109455</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.185603</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.052567</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.290572</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.230024</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.228838</td>\n",
       "      <td>Masters</td>\n",
       "      <td>0.866667</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.158193</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.137959</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>1.00</td>\n",
       "      <td>?</td>\n",
       "      <td>0.209939</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.533333</td>\n",
       "      <td>Widowed</td>\n",
       "      <td>?</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.246328</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.048632</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.5</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>0.115363</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.800000</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        age         workclass    fnlwgt     education  education-num  \\\n",
       "34189  0.50  Self-emp-not-inc  0.170866  Some-college       0.600000   \n",
       "34190  0.50         State-gov  0.029210     Bachelors       0.800000   \n",
       "34191  0.00           Private  0.109872     Assoc-voc       0.666667   \n",
       "34192  0.75         State-gov  0.167378  Some-college       0.600000   \n",
       "34193  0.75           Private  0.179013       HS-grad       0.533333   \n",
       "34194  0.00           Private  0.096406  Some-college       0.600000   \n",
       "34195  0.25         Local-gov  0.134583  Some-college       0.600000   \n",
       "34196  0.25           Private  0.221148  Some-college       0.600000   \n",
       "34197  0.25         State-gov  0.131109     Bachelors       0.800000   \n",
       "34198  0.00           Private  0.118962  Some-college       0.600000   \n",
       "34199  0.25           Private  0.095967           9th       0.266667   \n",
       "34200  0.25         Local-gov  0.093522  Some-college       0.600000   \n",
       "34201  0.50           Private  0.262611  Some-college       0.600000   \n",
       "34202  0.00                 ?  0.123478  Some-college       0.600000   \n",
       "34203  0.50           Private  0.021567    Assoc-acdm       0.733333   \n",
       "...     ...               ...       ...           ...            ...   \n",
       "48827  0.75           Private  0.144232       HS-grad       0.533333   \n",
       "48828  0.50           Private  0.159779     Assoc-voc       0.666667   \n",
       "48829  1.00           Private  0.190452    Assoc-acdm       0.733333   \n",
       "48830  0.25           Private  0.109455       HS-grad       0.533333   \n",
       "48831  0.75           Private  0.185603       HS-grad       0.533333   \n",
       "48832  1.00           Private  0.052567       HS-grad       0.533333   \n",
       "48833  0.25           Private  0.290572       HS-grad       0.533333   \n",
       "48834  0.00           Private  0.230024       HS-grad       0.533333   \n",
       "48835  0.75         Local-gov  0.228838       Masters       0.866667   \n",
       "48836  0.25           Private  0.158193     Bachelors       0.800000   \n",
       "48837  0.50           Private  0.137959     Bachelors       0.800000   \n",
       "48838  1.00                 ?  0.209939       HS-grad       0.533333   \n",
       "48839  0.50           Private  0.246328     Bachelors       0.800000   \n",
       "48840  0.50           Private  0.048632     Bachelors       0.800000   \n",
       "48841  0.25      Self-emp-inc  0.115363     Bachelors       0.800000   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "34189     Married-civ-spouse       Craft-repair         Husband   \n",
       "34190     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34191          Never-married      Other-service       Own-child   \n",
       "34192     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34193     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34194          Never-married              Sales       Own-child   \n",
       "34195     Married-civ-spouse       Craft-repair  Other-relative   \n",
       "34196               Divorced       Adm-clerical       Unmarried   \n",
       "34197          Never-married     Prof-specialty   Not-in-family   \n",
       "34198              Separated      Other-service       Own-child   \n",
       "34199              Separated       Craft-repair   Not-in-family   \n",
       "34200               Divorced       Adm-clerical       Unmarried   \n",
       "34201     Married-civ-spouse       Craft-repair         Husband   \n",
       "34202          Never-married                  ?       Own-child   \n",
       "34203  Married-spouse-absent       Adm-clerical  Other-relative   \n",
       "...                      ...                ...             ...   \n",
       "48827              Separated    Priv-house-serv   Not-in-family   \n",
       "48828          Never-married       Adm-clerical       Unmarried   \n",
       "48829               Divorced     Prof-specialty   Not-in-family   \n",
       "48830     Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "48831     Married-civ-spouse       Adm-clerical         Husband   \n",
       "48832     Married-civ-spouse              Sales         Husband   \n",
       "48833     Married-civ-spouse       Craft-repair         Husband   \n",
       "48834          Never-married      Other-service       Own-child   \n",
       "48835               Divorced      Other-service   Not-in-family   \n",
       "48836          Never-married     Prof-specialty       Own-child   \n",
       "48837               Divorced     Prof-specialty   Not-in-family   \n",
       "48838                Widowed                  ?  Other-relative   \n",
       "48839     Married-civ-spouse     Prof-specialty         Husband   \n",
       "48840               Divorced       Adm-clerical       Own-child   \n",
       "48841     Married-civ-spouse    Exec-managerial         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "34189               White    Male          0.0          0.0          0.75   \n",
       "34190               White    Male          0.0          0.0          0.50   \n",
       "34191               White  Female          0.0          0.0          0.00   \n",
       "34192               White    Male          0.0          0.0          0.50   \n",
       "34193               White    Male          0.0          0.0          0.50   \n",
       "34194               White  Female          0.0          0.0          0.25   \n",
       "34195               White    Male          0.0          0.0          0.50   \n",
       "34196               Black  Female          0.0          0.0          0.25   \n",
       "34197               White  Female          0.0          0.0          0.00   \n",
       "34198               White    Male          0.0          0.0          0.50   \n",
       "34199               White    Male          0.0          0.0          0.50   \n",
       "34200               White  Female          0.0          0.0          0.50   \n",
       "34201               White    Male          0.0          0.0          0.50   \n",
       "34202               White  Female          0.0          0.0          0.25   \n",
       "34203               White    Male          0.0          0.0          0.75   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "48827               White  Female          0.0          0.0          0.25   \n",
       "48828               Black  Female          0.0          0.0          0.50   \n",
       "48829               White    Male          0.0          0.0          0.50   \n",
       "48830               White    Male          0.0          0.0          0.50   \n",
       "48831               White    Male          0.0          0.0          0.50   \n",
       "48832               White    Male          0.0          0.0          0.75   \n",
       "48833               White    Male          0.0          0.0          0.50   \n",
       "48834               White  Female          0.0          0.0          0.50   \n",
       "48835               White    Male          0.0          0.0          0.50   \n",
       "48836               White    Male          0.0          0.0          0.50   \n",
       "48837               White  Female          0.0          0.0          0.50   \n",
       "48838               Black    Male          0.0          0.0          0.50   \n",
       "48839               White    Male          0.0          0.0          0.75   \n",
       "48840  Asian-Pac-Islander    Male          0.5          0.0          0.50   \n",
       "48841               White    Male          0.0          0.0          0.75   \n",
       "\n",
       "      native-country  class  \n",
       "34189  United-States  <=50K  \n",
       "34190  United-States   >50K  \n",
       "34191  United-States  <=50K  \n",
       "34192  United-States   >50K  \n",
       "34193  United-States  <=50K  \n",
       "34194  United-States  <=50K  \n",
       "34195  United-States  <=50K  \n",
       "34196  United-States  <=50K  \n",
       "34197  United-States  <=50K  \n",
       "34198  United-States  <=50K  \n",
       "34199  United-States  <=50K  \n",
       "34200  United-States  <=50K  \n",
       "34201  United-States  <=50K  \n",
       "34202  United-States  <=50K  \n",
       "34203  United-States  <=50K  \n",
       "...              ...    ...  \n",
       "48827  United-States  <=50K  \n",
       "48828  United-States  <=50K  \n",
       "48829  United-States  <=50K  \n",
       "48830  United-States  <=50K  \n",
       "48831  United-States  <=50K  \n",
       "48832  United-States  <=50K  \n",
       "48833  United-States  <=50K  \n",
       "48834  United-States  <=50K  \n",
       "48835  United-States  <=50K  \n",
       "48836  United-States  <=50K  \n",
       "48837  United-States  <=50K  \n",
       "48838  United-States  <=50K  \n",
       "48839  United-States  <=50K  \n",
       "48840  United-States  <=50K  \n",
       "48841  United-States   >50K  \n",
       "\n",
       "[14653 rows x 15 columns]"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_test_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "following-architect",
   "metadata": {},
   "source": [
    "## 8. `include_operators` and `exclude_operators` parameter\n",
    "The `include_operators` indicates which operator must be included in the cleaning pipeline. It is a list. For example: \n",
    "* `['one_hot', 'minmax', 'median', 'most_frequent']`\n",
    "\n",
    "The `exclude_operators` indicates which operator must be excluded in the cleaning pipeline. It has the same format with `include_operators`.\n",
    "\n",
    "The valid choices for `include_operators` and `exclude_operators`:\n",
    "* `one_hot`\n",
    "* `constant`\n",
    "* `most_frequent`\n",
    "* `drop`\n",
    "* `mean`\n",
    "* `median`\n",
    "* `standardize`\n",
    "* `minmax`\n",
    "* `maxabs`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ethical-warrant",
   "metadata": {},
   "source": [
    "## 9. `customized_cat_pipeline` and `customized_num_pipeline` parameter\n",
    "Experienced users can specify their own `customized_cat_pipeline` and `customized_num_pipeline`. The two parameters are lists including dictionaries of each component. Each compontent is also a dictionary including the name of specified operator and related parameters. For example: \n",
    "* `[\n",
    "    {\"cat_imputation\": {\"operator\": 'constant', \"cat_null_value\": ['?'], \"fill_val\": \"Hahahaha!!!!!\"}},\n",
    "]\n",
    "`\n",
    "\n",
    "Users can also specifiy their own operators. They just need to define a typical class with the `__init__` function, the `fit`, `transform` and `fit_transform` functions. When using them, the name of the class can be put at the operator's position."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "id": "amended-diploma",
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Any, Union\n",
    "import dask.dataframe as dd\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "class MaxAbsScaler:\n",
    "    def __init__(self) -> None:\n",
    "        self.name = \"minmaxScaler\"\n",
    "\n",
    "    def fit(self,\n",
    "            df: pd.Series) -> Any:\n",
    "        self.maxabs = df.abs().max()\n",
    "        return self\n",
    "\n",
    "    def transform(self,\n",
    "            df: pd.Series) -> pd.Series:\n",
    "        result = df.map(self.compute_val)\n",
    "        return result\n",
    "\n",
    "    def fit_transform(self,\n",
    "            df: pd.Series) -> pd.Series:\n",
    "        return  self.fit(df).transform(df)\n",
    "\n",
    "    def compute_val(self, val):\n",
    "        return val / self.maxabs\n",
    "\n",
    "customized_cat_pipeline = [\n",
    "    {\"cat_imputation\": {\"operator\": 'constant', \"cat_null_value\": ['?'], \"fill_val\": \"Hahahaha!!!!!\"}},\n",
    "]\n",
    "customized_num_pipeline = [\n",
    "    {\"num_scaling\": {\"operator\": MaxAbsScaler}},\n",
    "]\n",
    "cleaned_training_df, cleaned_test_df = clean_ml(training_df, test_df, customized_cat_pipeline=customized_cat_pipeline, customized_num_pipeline=customized_num_pipeline)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "id": "weekly-productivity",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.50</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.052210</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.25</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.056113</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.145245</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.158093</td>\n",
       "      <td>11th</td>\n",
       "      <td>0.4375</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.227930</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Wife</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>Cuba</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.191676</td>\n",
       "      <td>Masters</td>\n",
       "      <td>0.8750</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.107891</td>\n",
       "      <td>9th</td>\n",
       "      <td>0.3125</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>Jamaica</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.141201</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.030835</td>\n",
       "      <td>Masters</td>\n",
       "      <td>0.8750</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>1.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.107394</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.50</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.188902</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>1.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>0.25</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.095168</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>India</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.082354</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.138087</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.7500</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.082018</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.6875</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>Hahahaha!!!!!</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34174</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.116960</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34175</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.100584</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34176</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.098790</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>Hahahaha!!!!!</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34177</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.116847</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34178</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.150649</td>\n",
       "      <td>11th</td>\n",
       "      <td>0.4375</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34179</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.122702</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Protective-serv</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34180</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.073694</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>India</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34181</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>0.101648</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.75</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34182</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.026354</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34183</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.069738</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34184</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.074232</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Husband</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34185</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.149828</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>1.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>El-Salvador</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34186</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Hahahaha!!!!!</td>\n",
       "      <td>0.076621</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Hahahaha!!!!!</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34187</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Hahahaha!!!!!</td>\n",
       "      <td>0.170887</td>\n",
       "      <td>11th</td>\n",
       "      <td>0.4375</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Hahahaha!!!!!</td>\n",
       "      <td>Wife</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34188</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.206713</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Machine-op-inspct</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.00</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>34189 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        age         workclass    fnlwgt     education  education-num  \\\n",
       "0      0.50         State-gov  0.052210     Bachelors         0.8125   \n",
       "1      0.75  Self-emp-not-inc  0.056113     Bachelors         0.8125   \n",
       "2      0.50           Private  0.145245       HS-grad         0.5625   \n",
       "3      0.75           Private  0.158093          11th         0.4375   \n",
       "4      0.25           Private  0.227930     Bachelors         0.8125   \n",
       "5      0.50           Private  0.191676       Masters         0.8750   \n",
       "6      0.75           Private  0.107891           9th         0.3125   \n",
       "7      0.75  Self-emp-not-inc  0.141201       HS-grad         0.5625   \n",
       "8      0.25           Private  0.030835       Masters         0.8750   \n",
       "9      0.50           Private  0.107394     Bachelors         0.8125   \n",
       "10     0.50           Private  0.188902  Some-college         0.6250   \n",
       "11     0.25         State-gov  0.095168     Bachelors         0.8125   \n",
       "12     0.00           Private  0.082354     Bachelors         0.8125   \n",
       "13     0.25           Private  0.138087    Assoc-acdm         0.7500   \n",
       "14     0.50           Private  0.082018     Assoc-voc         0.6875   \n",
       "...     ...               ...       ...           ...            ...   \n",
       "34174  0.50           Private  0.116960  Some-college         0.6250   \n",
       "34175  0.75           Private  0.100584       HS-grad         0.5625   \n",
       "34176  1.00           Private  0.098790       HS-grad         0.5625   \n",
       "34177  1.00           Private  0.116847     Bachelors         0.8125   \n",
       "34178  0.00           Private  0.150649          11th         0.4375   \n",
       "34179  0.75           Private  0.122702  Some-college         0.6250   \n",
       "34180  0.00           Private  0.073694  Some-college         0.6250   \n",
       "34181  0.75      Self-emp-inc  0.101648  Some-college         0.6250   \n",
       "34182  1.00  Self-emp-not-inc  0.026354       HS-grad         0.5625   \n",
       "34183  0.75         Local-gov  0.069738     Bachelors         0.8125   \n",
       "34184  1.00           Private  0.074232       HS-grad         0.5625   \n",
       "34185  0.50           Private  0.149828       HS-grad         0.5625   \n",
       "34186  0.00     Hahahaha!!!!!  0.076621       HS-grad         0.5625   \n",
       "34187  0.50     Hahahaha!!!!!  0.170887          11th         0.4375   \n",
       "34188  0.00           Private  0.206713       HS-grad         0.5625   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "0              Never-married       Adm-clerical   Not-in-family   \n",
       "1         Married-civ-spouse    Exec-managerial         Husband   \n",
       "2                   Divorced  Handlers-cleaners   Not-in-family   \n",
       "3         Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "4         Married-civ-spouse     Prof-specialty            Wife   \n",
       "5         Married-civ-spouse    Exec-managerial            Wife   \n",
       "6      Married-spouse-absent      Other-service   Not-in-family   \n",
       "7         Married-civ-spouse    Exec-managerial         Husband   \n",
       "8              Never-married     Prof-specialty   Not-in-family   \n",
       "9         Married-civ-spouse    Exec-managerial         Husband   \n",
       "10        Married-civ-spouse    Exec-managerial         Husband   \n",
       "11        Married-civ-spouse     Prof-specialty         Husband   \n",
       "12             Never-married       Adm-clerical       Own-child   \n",
       "13             Never-married              Sales   Not-in-family   \n",
       "14        Married-civ-spouse       Craft-repair         Husband   \n",
       "...                      ...                ...             ...   \n",
       "34174     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34175              Separated  Handlers-cleaners   Not-in-family   \n",
       "34176     Married-civ-spouse      Other-service         Husband   \n",
       "34177               Divorced     Prof-specialty   Not-in-family   \n",
       "34178          Never-married      Other-service       Own-child   \n",
       "34179               Divorced    Protective-serv       Unmarried   \n",
       "34180          Never-married              Sales  Other-relative   \n",
       "34181     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34182     Married-civ-spouse       Craft-repair         Husband   \n",
       "34183     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34184     Married-civ-spouse      Other-service         Husband   \n",
       "34185          Never-married              Sales   Not-in-family   \n",
       "34186          Never-married      Hahahaha!!!!!       Own-child   \n",
       "34187     Married-civ-spouse      Hahahaha!!!!!            Wife   \n",
       "34188     Married-civ-spouse  Machine-op-inspct         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "0                   White    Male         0.25         0.00          0.50   \n",
       "1                   White    Male         0.00         0.00          0.00   \n",
       "2                   White    Male         0.00         0.00          0.50   \n",
       "3                   Black    Male         0.00         0.00          0.50   \n",
       "4                   Black  Female         0.00         0.00          0.50   \n",
       "5                   White  Female         0.00         0.00          0.50   \n",
       "6                   Black  Female         0.00         0.00          0.00   \n",
       "7                   White    Male         0.00         0.00          0.50   \n",
       "8                   White  Female         1.00         0.00          0.75   \n",
       "9                   White    Male         0.50         0.00          0.50   \n",
       "10                  Black    Male         0.00         0.00          1.00   \n",
       "11     Asian-Pac-Islander    Male         0.00         0.00          0.50   \n",
       "12                  White  Female         0.00         0.00          0.25   \n",
       "13                  Black    Male         0.00         0.00          0.75   \n",
       "14     Asian-Pac-Islander    Male         0.00         0.00          0.50   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "34174               White    Male         0.00         0.00          0.50   \n",
       "34175               White    Male         0.00         0.00          0.50   \n",
       "34176               Black    Male         0.00         0.00          0.50   \n",
       "34177               White  Female         0.00         0.00          0.25   \n",
       "34178               White    Male         0.00         0.00          0.00   \n",
       "34179               White  Female         0.00         0.00          0.25   \n",
       "34180  Asian-Pac-Islander    Male         0.00         0.00          0.00   \n",
       "34181               White    Male         0.00         0.75          0.50   \n",
       "34182               White    Male         0.00         0.00          0.25   \n",
       "34183               White    Male         0.00         0.00          0.75   \n",
       "34184               Black    Male         0.00         0.00          0.50   \n",
       "34185               White    Male         0.00         1.00          0.50   \n",
       "34186               White  Female         0.00         0.00          0.50   \n",
       "34187               White  Female         0.00         0.00          0.00   \n",
       "34188               White    Male         0.00         0.00          0.50   \n",
       "\n",
       "      native-country  class  \n",
       "0      United-States  <=50K  \n",
       "1      United-States  <=50K  \n",
       "2      United-States  <=50K  \n",
       "3      United-States  <=50K  \n",
       "4               Cuba  <=50K  \n",
       "5      United-States  <=50K  \n",
       "6            Jamaica  <=50K  \n",
       "7      United-States   >50K  \n",
       "8      United-States   >50K  \n",
       "9      United-States   >50K  \n",
       "10     United-States   >50K  \n",
       "11             India   >50K  \n",
       "12     United-States  <=50K  \n",
       "13     United-States  <=50K  \n",
       "14     Hahahaha!!!!!   >50K  \n",
       "...              ...    ...  \n",
       "34174  United-States   >50K  \n",
       "34175  United-States  <=50K  \n",
       "34176  Hahahaha!!!!!   >50K  \n",
       "34177  United-States  <=50K  \n",
       "34178  United-States  <=50K  \n",
       "34179  United-States  <=50K  \n",
       "34180          India  <=50K  \n",
       "34181  United-States   >50K  \n",
       "34182  United-States  <=50K  \n",
       "34183  United-States   >50K  \n",
       "34184  United-States  <=50K  \n",
       "34185    El-Salvador  <=50K  \n",
       "34186  United-States  <=50K  \n",
       "34187  United-States  <=50K  \n",
       "34188  United-States  <=50K  \n",
       "\n",
       "[34189 rows x 15 columns]"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_training_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "id": "wired-certification",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>workclass</th>\n",
       "      <th>fnlwgt</th>\n",
       "      <th>education</th>\n",
       "      <th>education-num</th>\n",
       "      <th>marital-status</th>\n",
       "      <th>occupation</th>\n",
       "      <th>relationship</th>\n",
       "      <th>race</th>\n",
       "      <th>sex</th>\n",
       "      <th>capitalgain</th>\n",
       "      <th>capitalloss</th>\n",
       "      <th>hoursperweek</th>\n",
       "      <th>native-country</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>34189</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Self-emp-not-inc</td>\n",
       "      <td>0.177726</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34190</th>\n",
       "      <td>0.50</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.037242</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34191</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.117237</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.6875</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34192</th>\n",
       "      <td>0.75</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.174267</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34193</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.185806</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34194</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.103883</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34195</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.141744</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34196</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.227593</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34197</th>\n",
       "      <td>0.25</td>\n",
       "      <td>State-gov</td>\n",
       "      <td>0.138299</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.00</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34198</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.126252</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34199</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.103447</td>\n",
       "      <td>9th</td>\n",
       "      <td>0.3125</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34200</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.101022</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34201</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.268713</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34202</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Hahahaha!!!!!</td>\n",
       "      <td>0.130730</td>\n",
       "      <td>Some-college</td>\n",
       "      <td>0.6250</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Hahahaha!!!!!</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34203</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.029663</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.7500</td>\n",
       "      <td>Married-spouse-absent</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48827</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.151313</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Separated</td>\n",
       "      <td>Priv-house-serv</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.25</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48828</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.166731</td>\n",
       "      <td>Assoc-voc</td>\n",
       "      <td>0.6875</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Unmarried</td>\n",
       "      <td>Black</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48829</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.197150</td>\n",
       "      <td>Assoc-acdm</td>\n",
       "      <td>0.7500</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48830</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.116824</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Handlers-cleaners</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48831</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.192341</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48832</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.060407</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Sales</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48833</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.296442</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Craft-repair</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48834</th>\n",
       "      <td>0.00</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.236395</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48835</th>\n",
       "      <td>0.75</td>\n",
       "      <td>Local-gov</td>\n",
       "      <td>0.235218</td>\n",
       "      <td>Masters</td>\n",
       "      <td>0.8750</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Other-service</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48836</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.165158</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Never-married</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48837</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.145092</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Not-in-family</td>\n",
       "      <td>White</td>\n",
       "      <td>Female</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48838</th>\n",
       "      <td>1.00</td>\n",
       "      <td>Hahahaha!!!!!</td>\n",
       "      <td>0.216476</td>\n",
       "      <td>HS-grad</td>\n",
       "      <td>0.5625</td>\n",
       "      <td>Widowed</td>\n",
       "      <td>Hahahaha!!!!!</td>\n",
       "      <td>Other-relative</td>\n",
       "      <td>Black</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48839</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.252564</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Prof-specialty</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48840</th>\n",
       "      <td>0.50</td>\n",
       "      <td>Private</td>\n",
       "      <td>0.056503</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Divorced</td>\n",
       "      <td>Adm-clerical</td>\n",
       "      <td>Own-child</td>\n",
       "      <td>Asian-Pac-Islander</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.5</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.50</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&lt;=50K</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48841</th>\n",
       "      <td>0.25</td>\n",
       "      <td>Self-emp-inc</td>\n",
       "      <td>0.122683</td>\n",
       "      <td>Bachelors</td>\n",
       "      <td>0.8125</td>\n",
       "      <td>Married-civ-spouse</td>\n",
       "      <td>Exec-managerial</td>\n",
       "      <td>Husband</td>\n",
       "      <td>White</td>\n",
       "      <td>Male</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.75</td>\n",
       "      <td>United-States</td>\n",
       "      <td>&gt;50K</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>14653 rows × 15 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        age         workclass    fnlwgt     education  education-num  \\\n",
       "34189  0.50  Self-emp-not-inc  0.177726  Some-college         0.6250   \n",
       "34190  0.50         State-gov  0.037242     Bachelors         0.8125   \n",
       "34191  0.00           Private  0.117237     Assoc-voc         0.6875   \n",
       "34192  0.75         State-gov  0.174267  Some-college         0.6250   \n",
       "34193  0.75           Private  0.185806       HS-grad         0.5625   \n",
       "34194  0.00           Private  0.103883  Some-college         0.6250   \n",
       "34195  0.25         Local-gov  0.141744  Some-college         0.6250   \n",
       "34196  0.25           Private  0.227593  Some-college         0.6250   \n",
       "34197  0.25         State-gov  0.138299     Bachelors         0.8125   \n",
       "34198  0.00           Private  0.126252  Some-college         0.6250   \n",
       "34199  0.25           Private  0.103447           9th         0.3125   \n",
       "34200  0.25         Local-gov  0.101022  Some-college         0.6250   \n",
       "34201  0.50           Private  0.268713  Some-college         0.6250   \n",
       "34202  0.00     Hahahaha!!!!!  0.130730  Some-college         0.6250   \n",
       "34203  0.50           Private  0.029663    Assoc-acdm         0.7500   \n",
       "...     ...               ...       ...           ...            ...   \n",
       "48827  0.75           Private  0.151313       HS-grad         0.5625   \n",
       "48828  0.50           Private  0.166731     Assoc-voc         0.6875   \n",
       "48829  1.00           Private  0.197150    Assoc-acdm         0.7500   \n",
       "48830  0.25           Private  0.116824       HS-grad         0.5625   \n",
       "48831  0.75           Private  0.192341       HS-grad         0.5625   \n",
       "48832  1.00           Private  0.060407       HS-grad         0.5625   \n",
       "48833  0.25           Private  0.296442       HS-grad         0.5625   \n",
       "48834  0.00           Private  0.236395       HS-grad         0.5625   \n",
       "48835  0.75         Local-gov  0.235218       Masters         0.8750   \n",
       "48836  0.25           Private  0.165158     Bachelors         0.8125   \n",
       "48837  0.50           Private  0.145092     Bachelors         0.8125   \n",
       "48838  1.00     Hahahaha!!!!!  0.216476       HS-grad         0.5625   \n",
       "48839  0.50           Private  0.252564     Bachelors         0.8125   \n",
       "48840  0.50           Private  0.056503     Bachelors         0.8125   \n",
       "48841  0.25      Self-emp-inc  0.122683     Bachelors         0.8125   \n",
       "\n",
       "              marital-status         occupation    relationship  \\\n",
       "34189     Married-civ-spouse       Craft-repair         Husband   \n",
       "34190     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34191          Never-married      Other-service       Own-child   \n",
       "34192     Married-civ-spouse    Exec-managerial         Husband   \n",
       "34193     Married-civ-spouse     Prof-specialty         Husband   \n",
       "34194          Never-married              Sales       Own-child   \n",
       "34195     Married-civ-spouse       Craft-repair  Other-relative   \n",
       "34196               Divorced       Adm-clerical       Unmarried   \n",
       "34197          Never-married     Prof-specialty   Not-in-family   \n",
       "34198              Separated      Other-service       Own-child   \n",
       "34199              Separated       Craft-repair   Not-in-family   \n",
       "34200               Divorced       Adm-clerical       Unmarried   \n",
       "34201     Married-civ-spouse       Craft-repair         Husband   \n",
       "34202          Never-married      Hahahaha!!!!!       Own-child   \n",
       "34203  Married-spouse-absent       Adm-clerical  Other-relative   \n",
       "...                      ...                ...             ...   \n",
       "48827              Separated    Priv-house-serv   Not-in-family   \n",
       "48828          Never-married       Adm-clerical       Unmarried   \n",
       "48829               Divorced     Prof-specialty   Not-in-family   \n",
       "48830     Married-civ-spouse  Handlers-cleaners         Husband   \n",
       "48831     Married-civ-spouse       Adm-clerical         Husband   \n",
       "48832     Married-civ-spouse              Sales         Husband   \n",
       "48833     Married-civ-spouse       Craft-repair         Husband   \n",
       "48834          Never-married      Other-service       Own-child   \n",
       "48835               Divorced      Other-service   Not-in-family   \n",
       "48836          Never-married     Prof-specialty       Own-child   \n",
       "48837               Divorced     Prof-specialty   Not-in-family   \n",
       "48838                Widowed      Hahahaha!!!!!  Other-relative   \n",
       "48839     Married-civ-spouse     Prof-specialty         Husband   \n",
       "48840               Divorced       Adm-clerical       Own-child   \n",
       "48841     Married-civ-spouse    Exec-managerial         Husband   \n",
       "\n",
       "                     race     sex  capitalgain  capitalloss  hoursperweek  \\\n",
       "34189               White    Male          0.0          0.0          0.75   \n",
       "34190               White    Male          0.0          0.0          0.50   \n",
       "34191               White  Female          0.0          0.0          0.00   \n",
       "34192               White    Male          0.0          0.0          0.50   \n",
       "34193               White    Male          0.0          0.0          0.50   \n",
       "34194               White  Female          0.0          0.0          0.25   \n",
       "34195               White    Male          0.0          0.0          0.50   \n",
       "34196               Black  Female          0.0          0.0          0.25   \n",
       "34197               White  Female          0.0          0.0          0.00   \n",
       "34198               White    Male          0.0          0.0          0.50   \n",
       "34199               White    Male          0.0          0.0          0.50   \n",
       "34200               White  Female          0.0          0.0          0.50   \n",
       "34201               White    Male          0.0          0.0          0.50   \n",
       "34202               White  Female          0.0          0.0          0.25   \n",
       "34203               White    Male          0.0          0.0          0.75   \n",
       "...                   ...     ...          ...          ...           ...   \n",
       "48827               White  Female          0.0          0.0          0.25   \n",
       "48828               Black  Female          0.0          0.0          0.50   \n",
       "48829               White    Male          0.0          0.0          0.50   \n",
       "48830               White    Male          0.0          0.0          0.50   \n",
       "48831               White    Male          0.0          0.0          0.50   \n",
       "48832               White    Male          0.0          0.0          0.75   \n",
       "48833               White    Male          0.0          0.0          0.50   \n",
       "48834               White  Female          0.0          0.0          0.50   \n",
       "48835               White    Male          0.0          0.0          0.50   \n",
       "48836               White    Male          0.0          0.0          0.50   \n",
       "48837               White  Female          0.0          0.0          0.50   \n",
       "48838               Black    Male          0.0          0.0          0.50   \n",
       "48839               White    Male          0.0          0.0          0.75   \n",
       "48840  Asian-Pac-Islander    Male          0.5          0.0          0.50   \n",
       "48841               White    Male          0.0          0.0          0.75   \n",
       "\n",
       "      native-country  class  \n",
       "34189  United-States  <=50K  \n",
       "34190  United-States   >50K  \n",
       "34191  United-States  <=50K  \n",
       "34192  United-States   >50K  \n",
       "34193  United-States  <=50K  \n",
       "34194  United-States  <=50K  \n",
       "34195  United-States  <=50K  \n",
       "34196  United-States  <=50K  \n",
       "34197  United-States  <=50K  \n",
       "34198  United-States  <=50K  \n",
       "34199  United-States  <=50K  \n",
       "34200  United-States  <=50K  \n",
       "34201  United-States  <=50K  \n",
       "34202  United-States  <=50K  \n",
       "34203  United-States  <=50K  \n",
       "...              ...    ...  \n",
       "48827  United-States  <=50K  \n",
       "48828  United-States  <=50K  \n",
       "48829  United-States  <=50K  \n",
       "48830  United-States  <=50K  \n",
       "48831  United-States  <=50K  \n",
       "48832  United-States  <=50K  \n",
       "48833  United-States  <=50K  \n",
       "48834  United-States  <=50K  \n",
       "48835  United-States  <=50K  \n",
       "48836  United-States  <=50K  \n",
       "48837  United-States  <=50K  \n",
       "48838  United-States  <=50K  \n",
       "48839  United-States  <=50K  \n",
       "48840  United-States  <=50K  \n",
       "48841  United-States   >50K  \n",
       "\n",
       "[14653 rows x 15 columns]"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cleaned_test_df"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
